2016
DOI: 10.1186/s13015-016-0066-8

Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage

Abstract: Background: High throughput sequencing technologies have become fast and cheap in the past years. As a result, large-scale projects started to sequence tens to several thousands of genomes per species, producing a high number of sequences sampled from each genome. Such a highly redundant collection of very similar sequences is called a pan-genome. It can be transformed into a set of sequences “colored” by the genomes to which they belong. A colored de Bruijn graph (C-DBG) extracts from the sequences all colored …

Citations: cited by 75 publications (64 citation statements)
References: 21 publications
“…The resulting structure is usually referred to as a colored de Bruijn graph [19] and its representations have been widely studied ( [50][51][52][53][54][55][56][57][58][59][60][61] ). Even though we touched this setting in the section Multiple pan-genomes, exploiting the similarity between individual de Bruijn graphs for further compression in simplitig-based approaches is to be addressed in future work.…”
Section: Discussion (mentioning)
confidence: 99%
“…The second is BOSS, which, as mentioned previously, was shown [35] to have superior space usage. We did not compare against the Bloom filter trie [36], which is fast but uses an order of Table 2: Space usage of UST-Compress and others. We show the average number of bits per distinct k-mer in the dataset.…”
Section: Evaluation of UST-FM (mentioning)
confidence: 99%
“…Membership data structures for k-mer sets were surveyed in a recent paper [9]. In addition to the unitig-based approaches already mentioned, other exact representations include succinct de Bruijn graphs (referred to as BOSS [36]) and their variations [37,38], dynamic de Bruijn graphs [39,40], and Bloom filter tries [41]. Some data structures are non-static, i.e.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hence, this allows the matrix to be compressed and stored independently of the graph. Also several other methods have been developed to further compress, store, and manipulate the color matrix, including Rainbowfish (Almodaresi et al, 2017), Mantis (Pandey et al, 2018), Bloom Filter Trie (BFT) (Holley et al, 2016), and Bifrost (Holley and Melsted, 2019).…”
Section: Introduction (mentioning)
confidence: 99%