2014
DOI: 10.1093/bioinformatics/btu132

Turtle: Identifying frequent k-mers with cache-efficient algorithms

Abstract: The tools are freely available for download at http://bioinformatics.rutgers.edu/Software/Turtle and http://figshare.com/articles/Turtle/791582.

Cited by 61 publications (37 citation statements)
References 21 publications
“…Numerous software packages can organize the raw sequencing data of each individual into comprehensive k-mer lists 28, 31–34, which can be later used for fast retrieval of k-mer counts. However, the compilation of full-genome lists is somewhat inefficient if the lists are only used once and then immediately deleted.…”
Section: Discussion (mentioning)
confidence: 99%
“…Both software were run for 4 values of k simultaneously, k = 31, 47, 63, 79, using four threads for parallelism. As a comparison, running a fast k-mer counter, scTurtle (Roy et al, 2014), took 3594 s using eight cores and 26 G of memory, for a single value of k = 31 for B. impatiens.…”
Section: Comparison to KmerGenie (mentioning)
confidence: 99%
“…Much work has been done on reducing memory requirements, based on exact or approximately correct methods of keeping track of a large set of k-mers, this work includes using succinct set representations (Conway and Bromage, 2011) or probabilistic encodings such as Bloom filters (Chikhi and Rizk, 2012; Melsted and Pritchard, 2011; Pell et al, 2012), whereas recent advances have focused on more speed (Deorowicz et al, 2013; Roy et al, 2014). Although the impact on memory usage is considerable, compared to previous approaches, these methods require storing all k-mers, explicitly or implicitly, in memory.…”
Section: Introduction (mentioning)
confidence: 99%
“…Other tools like DSK [7] and KMC [8] exploit a two-disk architecture and aim at reducing expensive IO operations. Turtle [9] replaces a standard Bloom filter by a cache-efficient counterpart. MSPKmerCounter [10] introduces the concept of minimizers to the k-mer counting, thus further optimizing the disk-based approach.…”
Section: Introduction (mentioning)
confidence: 99%
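
The last statement describes Turtle as swapping a standard Bloom filter for a cache-efficient one. As a rough illustration of that general idea, the sketch below uses a blocked Bloom filter, in which every probe bit for a key lands in a single 512-bit block (one 64-byte cache line), and counts only k-mers that the filter has already seen. This is not Turtle's code: the class name, the function `frequent_kmers`, the hash choice, and all sizes are assumptions made for illustration.

```python
import hashlib


class BlockedBloomFilter:
    """Illustrative blocked Bloom filter: one key touches exactly one block."""

    def __init__(self, num_blocks=1024, block_bits=512, num_hashes=4):
        self.num_blocks = num_blocks
        self.block_bits = block_bits      # 512 bits = one 64-byte cache line
        self.num_hashes = num_hashes
        self.blocks = [0] * num_blocks    # each block is an int used as a bitset

    def _positions(self, key):
        # Derive two 64-bit hashes from one digest (double hashing).
        digest = hashlib.blake2b(key.encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # odd step
        block = h1 % self.num_blocks
        bits = [((h1 >> 32) + i * h2) % self.block_bits
                for i in range(self.num_hashes)]
        return block, bits

    def add(self, key):
        """Insert key; return True if it was (probably) already present."""
        block, bits = self._positions(key)
        seen = True
        for b in bits:
            mask = 1 << b
            if not self.blocks[block] & mask:
                seen = False
            self.blocks[block] |= mask
        return seen

    def __contains__(self, key):
        block, bits = self._positions(key)
        return all(self.blocks[block] & (1 << b) for b in bits)


def frequent_kmers(reads, k):
    """Count k-mers seen at least twice; singletons stay only in the filter."""
    bloom = BlockedBloomFilter()
    counts = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if bloom.add(kmer):
                # Second or later sighting: start the count at 2, then increment.
                counts[kmer] = counts.get(kmer, 1) + 1
    return counts


counts = frequent_kmers(["ACGTACGT", "ACGTTTTT"], k=4)
```

Because a Bloom filter can report false positives, a k-mer that occurs once may occasionally be reported with a count of 2; a real tool would need a later pass or a tolerance for that error. The blocking trades a slightly higher false-positive rate for at most one cache miss per lookup.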