Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics 2014
DOI: 10.1145/2649387.2649436

Strand

Abstract: The Super Threaded Reference-Free Alignment-Free Nsequence Decoder (Strand) is a highly parallel technique for the learning and classification of gene sequence data into any number of associated categories or gene sequence taxonomies. Current methods, including the state-of-the-art sequence classification method RDP, balance performance by using a shorter word length. Strand in contrast uses a much longer word length, and does so efficiently by implementing a Divide and Conquer algorithm leveraging MapReduce s…

Citations: cited by 9 publications (2 citation statements)
References: 21 publications (20 reference statements)
“…The other phyla had under 15 genomes per phyla, with Verrucomicrobiota having only a single genome. Using the 16S sequences from each genome, we calculated the MASH distance between every pair of genomes, which accounted for multiple unique 16S genes per genome [31, 32]. Core genes were identified using the UBCG2 database, resulting in 34,051,278 genes across the database, roughly averaging to 70 genes per genome.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…The other phyla had under 15 genomes per phyla, with Verrucomicrobiota having only a single genome. Using the 16S sequences from each genome, we calculated the MASH distance between every pair of genomes, which accounted for multiple unique 16S genes per genome (30, 31). Core genes were identified using the UBCG2 database, resulting in 34,051,278 genes across the database, roughly averaging 70 genes per genome.…”
Section: Results
Citation type: mentioning
confidence: 99%