2014
DOI: 10.1093/sysbio/syu106

Genetic Distance for a General Non-Stationary Markov Substitution Process

Abstract: The genetic distance between biological sequences is a fundamental quantity in molecular evolution. It pertains to questions of rates of evolution, existence of a molecular clock, and phylogenetic inference. Under the class of continuous-time substitution models, the distance is commonly defined as the expected number of substitutions at any site in the sequence. We eschew the almost ubiquitous assumptions of evolution under stationarity and time-reversible conditions and extend the concept of the expected num…
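
The definition in the abstract (distance as the expected number of substitutions at a site under a continuous-time model) can be illustrated with a minimal sketch. It relies only on the standard Markov-chain fact that state i is left at rate -Q[i][i], so the expected count over time t is the integral of -sum_i pi_i(s) Q[i][i], where pi(s) evolves from the starting distribution. The function name `expected_substitutions`, the two-state rate matrix, and the starting distributions are all assumptions chosen so the result can be checked by hand; this is not the paper's estimation procedure, and a nucleotide model would use four states.

```python
def expected_substitutions(pi0, Q, t, steps=2000):
    """Approximate the expected number of substitutions per site over time t.

    Integrates -sum_i pi_i(s) * Q[i][i] for s in [0, t], where the state
    distribution pi(s) follows d(pi)/ds = pi(s) Q starting from pi(0) = pi0.
    """
    n = len(pi0)

    def flow(pi):
        # Row vector times rate matrix: (pi Q)_j = sum_i pi_i * Q[i][j].
        return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]

    def leave_rate(pi):
        # Instantaneous substitution rate: state i is left at rate -Q[i][i].
        return -sum(pi[i] * Q[i][i] for i in range(n))

    def shifted(pi, k, scale):
        return [p + scale * d for p, d in zip(pi, k)]

    h = t / steps
    pi = list(pi0)
    total = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for pi(s).
        k1 = flow(pi)
        k2 = flow(shifted(pi, k1, h / 2))
        k3 = flow(shifted(pi, k2, h / 2))
        k4 = flow(shifted(pi, k3, h))
        new_pi = [p + h / 6 * (a + 2 * b + 2 * c + d)
                  for p, a, b, c, d in zip(pi, k1, k2, k3, k4)]
        # Trapezoid rule for the integral of the substitution rate.
        total += h / 2 * (leave_rate(pi) + leave_rate(new_pi))
        pi = new_pi
    return total


# Hypothetical two-state rate matrix (rows sum to zero). Its stationary
# distribution is (0.75, 0.25), so the stationary rate is 1.5 per unit time.
Q = [[-1.0, 1.0],
     [3.0, -3.0]]

d_stationary = expected_substitutions([0.75, 0.25], Q, t=1.0)
d_nonstationary = expected_substitutions([0.0, 1.0], Q, t=1.0)
print(d_stationary, d_nonstationary)
```

With these illustrative values, the process started away from stationarity accumulates more expected substitutions over the same elapsed time than the stationary one, which is why elapsed time alone need not measure distance once stationarity is dropped.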

Cited by 22 publications
(67 citation statements)
References 46 publications
“…For example, LogDet or paralinear distances have been proposed (Felsenstein 2004) which correct for variation in base composition across a phylogeny. A general nonstationary Markov model has also been proposed that accounts for changes in the substitution process across a tree when calculating pairwise genetic distances among taxa (Kaehler et al 2015). The most flexible option for dealing with process heterogeneity across sites or taxa may be to use patristic distances calculated from branch lengths in a phylogeny for DNA barcoding studies.…”
Section: Discussion
mentioning
confidence: 99%
“…We have obtained that K80 DLC matrices are embeddable if and only if its principal logarithm is a rate matrix (Remark 4.9). In parameter estimation in phylogenetics, it might be relevant to restrict to a subset of matrices where identifiability of the substitution parameters is guaranteed (for a discussion see Zou et al, 2011 and Kaehler et al, 2015). The set of all DLC matrices is one of these subsets (see Chang, 1996).…”
Section: Discussion
mentioning
confidence: 99%
“…In the simplest sense, our ability to draw inference relies on how well these models represent the process of neutral sequence evolution. As we demonstrated previously (Kaehler et al, 2015), utilising time-reversible nucleotide substitution processes distorts our estimation of the number of events in a manner that is proportional to the extent of non-stationarity. Non-stationarity is common across the tree of life (e.g.…”
mentioning
confidence: 84%
“…For non-stationary models, such as GNC, τ does not necessarily equal the expected number of substitutions (Kaehler et al, 2015). Rather, the genetic distance can be calculated as…”
mentioning
confidence: 99%