Linear-time superbubble identification algorithm for genome assembly
- Submitting institution
-
King's College London
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 86514922
- Type
- D - Journal article
- DOI
-
10.1016/j.tcs.2015.10.021
- Title of journal
- Theoretical Computer Science
- Article number
- -
- First page
- 374
- Volume
- 609, Part 2
- Issue
- -
- ISSN
- 0304-3975
- Open access status
- Out of scope for open access requirements
- Month of publication
- October
- Year of publication
- 2015
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
-
5
- Research group(s)
-
-
- Citation count
- 9
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- Whole-genome sequencing techniques produce masses of data in the form of reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilise de Bruijn graphs constructed from these reads. A critical step of these algorithms is the detection of motif structures (superbubbles) in the graph that are caused by sequencing errors or repeats. This paper was a breakthrough: it gave the first linear-time algorithm to detect these structures. The techniques and data structures used are novel. The algorithms are practical, as shown experimentally. Our software is available at https://github.com/Ritu-Kundu/Superbubbles, and the algorithm and its extensions to other forms of graphs have been implemented in saboten (https://crates.io/crates/saboten).
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -