1 October 2018 Scaffolding algorithm using second- and third-generation reads
Proceedings Volume 10808, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2018; 108083A (2018) https://doi.org/10.1117/12.2501505
Event: Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2018, 2018, Wilga, Poland
Abstract
Second-generation sequencing methods produce high-quality short reads, which DNA assemblers assemble into contigs. Because the length of a single read is limited to 500 bp, it is very hard to assemble full genomes or full chromosomes. Generating longer contigs at a low sequencing cost is a main goal of computer scientists in this area. We propose to link contigs created from second-generation reads using reads from third-generation sequencers. Such reads are 10-20 kbp long. An existing implementation of this approach appears to be time- and memory-demanding for larger genomes. We developed an algorithm based on a Bloom filter and an extremely memory-efficient associative array. Our implementation remarkably outperforms the previous one in terms of time and memory consumption. The presented algorithm, provided as a shared library, is part of the dnaasm de novo assembler. The library was written in C++ using the Boost and Google Sparse Hash libraries. Both a web browser-based graphical user interface and a command-line interface are provided. Source code, a demo web application, and a Docker image are available at the dnaasm project web page: http://dnaasm.sourceforge.net. Our application has been tested on real data from bacterial, yeast, and plant genomes.
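The abstract names a Bloom filter as one building block but does not describe how it is used. As a rough illustration only, the sketch below shows a generic Bloom filter over k-mers in C++. It is not the dnaasm implementation: the class name `KmerBloomFilter`, the helper `add_kmers`, the FNV-1a-based double hashing, and all parameter values are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical minimal Bloom filter for k-mers (illustrative, not dnaasm code).
class KmerBloomFilter {
public:
    KmerBloomFilter(std::size_t num_bits, std::size_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes) {
        assert(num_bits > 0 && num_hashes > 0);
    }

    // Set the bit chosen by each of the num_hashes_ hash functions.
    void insert(const std::string& kmer) {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            bits_[index(kmer, i)] = true;
        }
    }

    // false: the k-mer was definitely never inserted.
    // true:  the k-mer was probably inserted (false positives are possible).
    bool possibly_contains(const std::string& kmer) const {
        for (std::size_t i = 0; i < num_hashes_; ++i) {
            if (!bits_[index(kmer, i)]) {
                return false;
            }
        }
        return true;
    }

private:
    // FNV-1a hash with a caller-supplied starting value (assumed hash choice).
    static std::uint64_t fnv1a(const std::string& s, std::uint64_t seed) {
        std::uint64_t h = seed;
        for (unsigned char c : s) {
            h ^= c;
            h *= 1099511628211ULL;
        }
        return h;
    }

    // Double hashing: the i-th position is (h1 + i * h2) mod m.
    std::size_t index(const std::string& kmer, std::size_t i) const {
        const std::uint64_t h1 = fnv1a(kmer, 14695981039346656037ULL);
        const std::uint64_t h2 = fnv1a(kmer, 0x9E3779B97F4A7C15ULL) | 1ULL;
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    std::vector<bool> bits_;   // bit array, one bit per slot
    std::size_t num_hashes_;   // number of hash positions per k-mer
};

// Insert every k-mer (substring of length k) of seq into the filter.
void add_kmers(KmerBloomFilter& filter, const std::string& seq, std::size_t k) {
    if (k == 0 || seq.size() < k) {
        return;
    }
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        filter.insert(seq.substr(i, k));
    }
}
```

A caller could, for instance, build `KmerBloomFilter filter(1 << 16, 3)`, call `add_kmers(filter, contig, 4)` for each contig, and then test k-mers from long reads with `possibly_contains`. The filter never reports a false negative, and its false-positive rate depends on the bit-array size and the number of hash functions, which is why it can stand in for a much larger exact set when memory is tight.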
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Wiktor Franus, Wiktor Kuśmirek, and Robert M. Nowak "Scaffolding algorithm using second- and third-generation reads", Proc. SPIE 10808, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2018, 108083A (1 October 2018); https://doi.org/10.1117/12.2501505
KEYWORDS
Associative arrays

Algorithm development

C++

Computer programming

Computer science

Bioinformatics
