2019
DOI: 10.48550/arxiv.1910.09020
Preprint

SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs

Abstract: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capacity of existing algorithms and compute infrastructures. Calculating the similarities between a pair of genomic sequences is one of the most fundamental computational steps in genomic analysis. This step, called sequence alignment, is formulated as an approximate string matching (ASM) problem, which is typically solved using computationally expensive dynamic programming algorithms. In this work, we introduce Sne…
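
For readers unfamiliar with the dynamic programming formulation the abstract refers to, the following is a minimal sketch of the textbook Levenshtein (edit distance) recurrence. It is an illustration only, not the paper's method; the function name `edit_distance` is hypothetical, and global alignment with unit costs is an assumption made for brevity.

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook Levenshtein distance; runs in O(len(a) * len(b)) time."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

The quadratic cost of filling this table for every candidate pair is the expense that pre-alignment filters try to avoid.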

citations
Cited by 4 publications
(20 citation statements)
references
References 30 publications
(58 reference statements)
“…Several recent works propose approaches and techniques to directly or indirectly accelerate or improve the accuracy of metagenomics profiling, the first step of such studies. These works take three approaches: (1) Reducing the reference database's size by pre-alignment filtering [86,87] or heuristics for taxonomic classification techniques [55,88-91], (2) Accelerating read alignment or assembly (only for alignment-/assembly-based profilers) on CPUs, FPGAs, or GPUs [92-98], (3) post-alignment/-assembly/-classification presence and abundance estimation heuristics [54,55,99]. Demeter is categorized in the first group, taking a HDC-based approach for the first time.…”
Section: Metagenomic Profilers
Citation type: mentioning
confidence: 99%
“…To avoid examining dissimilar sequences at the downstream computationally-expensive read alignment step, a pre-alignment filter estimates the edit distance between every read and the regions of the reference at each read's candidate mapping locations, and uses this estimation to quickly decide whether or not read alignment is needed. If the sequences are dissimilar enough, significant amount of time is saved by avoiding the expensive alignment step [9,10,13,176,177].…”
Section: Genasm Framework
Citation type: mentioning
confidence: 99%
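
The filter-then-align decision described in the statement above can be sketched as follows. This is a minimal illustration under stated assumptions, not SneakySnake or any of the cited filters: it swaps in a simple base-count lower bound on the edit distance, and the names `count_lower_bound` and `needs_alignment` are hypothetical. Because every edit changes the base counts by at most one surplus and one deficit, the bound never exceeds the true (global) edit distance, so a pair within the threshold is never rejected.

```python
from collections import Counter


def count_lower_bound(read: str, ref: str) -> int:
    """Cheap lower bound on the global edit distance from base counts.

    Assumption: a stand-in estimate for illustration, not a published filter.
    """
    r, g = Counter(read), Counter(ref)
    surplus = sum((r - g).values())  # bases the read has too many of
    deficit = sum((g - r).values())  # bases the read has too few of
    # One edit fixes at most one surplus base and one deficit base.
    return max(surplus, deficit)


def needs_alignment(read: str, ref: str, threshold: int) -> bool:
    """Pass the pair to the expensive aligner only if it might be similar."""
    return count_lower_bound(read, ref) <= threshold


# Hypothetical read and candidate reference regions.
read = "ACGTACGT"
candidates = ["ACGTACGA", "TTTTTTTT"]
kept = [ref for ref in candidates if needs_alignment(read, ref, threshold=2)]
```

With a threshold of 2, only the first candidate survives the filter; the second is rejected without running alignment. A bound this loose lets many dissimilar pairs through, which is why practical filters use more accurate estimates.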
“…Examples of such filters are the Adjacency Filter [177] that is implemented for standard CPUs, SHD [176] that uses SIMD-capable CPUs, and GRIM-Filter [91] that is built in 3D-stacked memory. Many works also exploit the large amounts of parallelism offered by FPGA architectures for pre-alignment filtering, such as GateKeeper [10], MAGNET [11], Shouji [9], and SneakySnake [13]. A recent work, GenCache [122], proposes an in-cache accelerator to improve the filtering (i.e., seeding) mechanism of GenAx (for short reads) by using in-cache operations [1] and software modifications.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…This backtracking step involves irregular memory access patterns that are challenging for hardware implementation. Second, a few works [17,18] propose a filtering step before alignment, called pre-alignment filtering, to significantly speed up the end-to-end sequence alignment of (long) reads by heuristically replacing the need for expensive DP solutions for many inputs in the first place. These filters use a pre-defined edit distance threshold between the inputs and quickly determine whether or not an alignment (i.e., DP) should be granted.…”
Section: Introduction
Citation type: mentioning
confidence: 99%