2010
DOI: 10.1089/cmb.2010.0056

Alignment-Free Sequence Comparison (II): Theoretical Power of Comparison Statistics

Abstract: Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy to use program. The power is assessed under two alternative hidden Markov models; the first one assumes that…
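To make the three statistics concrete, here is a minimal Python sketch. Only the description of D2 (a count of matching k-tuples) comes from the abstract. The formulas used for D2* and D2S are the ones commonly given in the alignment-free literature and are an assumption here, as are the i.i.d. background model and the helper names `kmer_counts`, `word_prob`, and `d2_variants`. This is an illustration, not the program the authors provide.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-tuple (k-mer) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(a, b, k):
    """D2 = sum over words w of X_w * Y_w (number of matching k-tuple pairs)."""
    x, y = kmer_counts(a, k), kmer_counts(b, k)
    return sum(x[w] * y[w] for w in x.keys() & y.keys())


def word_prob(word, letter_probs):
    """Probability of a word under an i.i.d. letter model (assumed background)."""
    p = 1.0
    for letter in word:
        p *= letter_probs[letter]
    return p


def d2_variants(a, b, k, letter_probs):
    """Return (D2*, D2S) using centered counts.

    Assumed forms (not stated in the abstract):
      Xc_w = X_w - (n - k + 1) * p_w,  Yc_w = Y_w - (m - k + 1) * p_w
      D2*  = sum_w Xc_w * Yc_w / (sqrt((n - k + 1) * (m - k + 1)) * p_w)
      D2S  = sum_w Xc_w * Yc_w / sqrt(Xc_w**2 + Yc_w**2)
    """
    x, y = kmer_counts(a, k), kmer_counts(b, k)
    n_bar, m_bar = len(a) - k + 1, len(b) - k + 1
    d2_star = 0.0
    d2_s = 0.0
    # Iterate over every possible word, since absent words still contribute
    # through their (negative) centered counts.
    for w in map("".join, product(sorted(letter_probs), repeat=k)):
        p_w = word_prob(w, letter_probs)
        xc = x[w] - n_bar * p_w
        yc = y[w] - m_bar * p_w
        d2_star += xc * yc / (sqrt(n_bar * m_bar) * p_w)
        denom = sqrt(xc * xc + yc * yc)
        if denom > 0:  # skip words whose centered counts are both zero
            d2_s += xc * yc / denom
    return d2_star, d2_s


if __name__ == "__main__":
    uniform = {c: 0.25 for c in "ACGT"}  # hypothetical background frequencies
    print(d2("ACGT", "ACGA", 2))
    print(d2_variants("ACGT", "ACGA", 2, uniform))
```

Enumerating all 4^k words keeps the sketch short but is only practical for small k. The letter probabilities would in practice be estimated from the sequences rather than fixed to uniform values.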

Cited by 113 publications (143 citation statements)
References 16 publications
“…Such techniques were proposed as a fast alternative to much more time-consuming alignment methods, but at the expense of accuracy. Some detailed reviews of k-mer algorithms for sequence comparison (as well as others approaches based on information theory) were presented by Vinga et al [17], Reinert et al [16] and Wan et al [18]. The main idea of using k-mers in sequences comparison usually boils down to two stages.…”
Section: The Use Of K-mer In Biological Sequence Comparison | Citation type: mentioning
confidence: 99%
“…To this end, theoretical aspects of k-mer statistics for biological sequence comparison have been studied in detail before. [7][8][9][10] Previous work has shown a similar approach of studying optimal length of peptides shared among close homologues for ultra-fast protein searches. 11 As also shown previously, simple oligonucleotide or peptide overlap measures between two genomes can be indicative of their phylogenetic distance.…”
Section: Determination Of Distance Measure | Citation type: mentioning
confidence: 99%
“…Certain other AF methods may fit uncomfortably into these classes, or lie outside them altogether 30 . In the present context, the motivating concept is the same: substrings (perhaps defined by k-mers) that meet certain criteria, and are shared by a set of sequences, can be considered as capturing part of the homology signal and are thus potentially informative on phylogeny.…”
Section: Alignment-free Methods and K-mers | Citation type: mentioning
confidence: 99%
“…For instance, regions of ribosomal RNA (16S or 23S for bacteria, 18S or 28S for eukaryotes) are more highly conserved compared to genes coding for metabolic functions 30,31 . For a given protein however the evolutionary rate can be modelled as constant across evolutionary time and among the different lineages.…”
Section: Phylogenetic Trees | Citation type: mentioning
confidence: 99%