2022
DOI: 10.1101/2022.11.20.517297
Preprint

A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers

Abstract: Nanopore sequencing is a widely-used high-throughput genome sequencing technology that can sequence long fragments of a genome. Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases (i.e., A, C, G, T) using a computational step called basecalling. The accuracy and speed of basecalling have critical implications for every subsequent step in genome analysis. Currently, basecallers are mainly based on deep learning techniques to provide hig…
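For orientation, the sketch below shows (in PyTorch) the general shape of a deep learning basecaller as described in the abstract: a convolutional encoder turns raw current samples into per-timestep log-probabilities over {blank, A, C, G, T}, suitable for CTC decoding. It is an illustrative toy, not the framework proposed in the paper; all layer names and sizes are assumptions.

```python
# Minimal, illustrative sketch of a deep learning basecaller (not the paper's model).
# It maps a window of raw nanopore current samples to per-timestep probabilities
# over {blank, A, C, G, T}, the form typically decoded with CTC.
import torch
import torch.nn as nn

class TinyBasecaller(nn.Module):
    def __init__(self, n_classes=5):  # blank + A, C, G, T
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=9, stride=3, padding=4),   # downsample raw signal
            nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=9, stride=2, padding=4),
            nn.ReLU(),
        )
        self.head = nn.Linear(128, n_classes)

    def forward(self, signal):                   # signal: (batch, samples)
        x = self.encoder(signal.unsqueeze(1))    # -> (batch, 128, time)
        x = x.transpose(1, 2)                    # -> (batch, time, 128)
        return self.head(x).log_softmax(-1)      # per-step log-probs for CTC decoding

# Example: one read chunk of 4,000 current samples -> base log-probabilities.
model = TinyBasecaller()
log_probs = model(torch.randn(1, 4000))
```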

Cited by 4 publications (18 citation statements) | References 116 publications (216 reference statements)
“…One of the immediate steps after generating raw nanopore signals is their translation to their corresponding DNA bases as sequences of characters with a computationally intensive step, basecalling. Basecalling approaches are usually computationally costly and consume significant energy as they use complex deep learning models [26][27][28][29][30][31][32][33][34][35][36][37][38]. Although we do not evaluate in this work, we expect that RawHash can be used as a low-cost filter to eliminate the reads that are unlikely to be useful in downstream analysis, which can reduce the overall workload of basecallers and further downstream analysis.…”
Section: Discussion
confidence: 99%
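The pre-basecalling filtering idea in the statement above can be sketched as follows; `maps_to_target` and `basecall` are hypothetical placeholders standing in for a RawHash-style raw-signal check and a deep learning basecaller, not real APIs.

```python
# Illustrative sketch of using a cheap raw-signal filter before basecalling.
# maps_to_target() and basecall() are hypothetical placeholders; RawHash and
# real basecallers expose their own interfaces.
def filtered_basecalling(raw_reads, maps_to_target, basecall):
    """Basecall only reads whose raw signal appears useful for downstream analysis."""
    called = []
    for read in raw_reads:
        if maps_to_target(read.signal):     # cheap raw-signal check (RawHash-style)
            called.append(basecall(read))   # expensive deep learning basecalling
        # else: skip the read, saving basecalling compute and energy
    return called
```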
“…Modern deep learning-based basecallers [20, 21, 36, 38, 39, 48, 5254] incorporate skip connections to help mitigate the vanishing gradient and saturation problems [55]. Removing skip connections has a higher impact on basecalling accuracy.…”
Section: Methods
confidence: 99%
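A minimal PyTorch sketch of the skip (residual) connection the statement refers to: the block's input is added back to its convolutional output, giving gradients an identity path around the transformation, which is why removing such connections tends to hurt training and basecalling accuracy. Channel and kernel sizes are illustrative, not taken from any particular basecaller.

```python
# Minimal residual (skip connection) block, as used in many basecaller encoders.
# The identity path lets gradients flow around the convolutions, mitigating the
# vanishing gradient problem mentioned above. Sizes are illustrative.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=128, kernel_size=9):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):                  # x: (batch, channels, time)
        return self.act(self.body(x) + x)  # skip connection: add the input back

# Without the "+ x" term the block becomes a plain convolutional stack,
# which is the kind of ablation the passage describes as hurting accuracy.
```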
“…However, these deep learning models use millions of model parameters, which makes basecalling computationally expensive. Recent works propose algorithmic optimizations [33, 48] and hardware accelerators [85] to improve the performance of basecallers. These works accelerate the basecalling step without eliminating the wasted computation in basecalling.…”
Section: Related Work
confidence: 99%
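As a rough illustration of the "millions of model parameters" point, the snippet below counts the trainable parameters of a placeholder basecaller-style network; the architecture is an assumption chosen only to show the scale involved, not any model from the paper.

```python
import torch.nn as nn

# Count trainable parameters of a (placeholder) basecaller-style model; production
# basecallers have millions of such parameters, which drives their compute cost.
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

demo = nn.Sequential(                      # stand-in for a real basecaller network
    nn.Conv1d(1, 256, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=9, padding=4),
    nn.LSTM(256, 256, num_layers=3),
)
print(f"trainable parameters: {count_parameters(demo):,}")  # on the order of millions
```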