2014
DOI: 10.1371/journal.pone.0095722

The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales

Abstract: Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while sub…
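As a rough illustration of the two models the abstract names, the sketch below computes equal-probability discrete gamma rate categories (using the mean rate of each category, with the gamma shape and rate both set to alpha so the mean rate is 1) and then adds a class of invariable sites. The function names, the pure-Python incomplete gamma routine, and the choice of four categories are assumptions made for this example; they are not taken from the paper or from any particular phylogenetics package.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x)."""
    if x <= 0.0:
        return 0.0
    log_prefix = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1.0:
        # Power series, which converges quickly when x < a + 1.
        term = 1.0 / a
        total = term
        n = a
        for _ in range(10000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return total * math.exp(log_prefix)
    # Continued fraction (modified Lentz) for the upper tail Q = 1 - P.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return 1.0 - math.exp(log_prefix) * h


def gamma_quantile(alpha, p):
    """Quantile of Gamma(shape=alpha, rate=alpha), found by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(alpha, alpha * hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(alpha, alpha * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k=4):
    """Mean rate of each of k equal-probability gamma categories (+G)."""
    cuts = [gamma_quantile(alpha, i / k) for i in range(1, k)]
    # For shape = rate = alpha, the partial mean up to q is P(alpha + 1, alpha * q).
    cum = [0.0] + [reg_lower_gamma(alpha + 1.0, alpha * q) for q in cuts] + [1.0]
    return [k * (cum[i + 1] - cum[i]) for i in range(k)]


def gamma_invariant_rates(alpha, p_inv, k=4):
    """(rate, weight) pairs for a +I+G mixture with overall mean rate 1."""
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("p_inv must lie in [0, 1)")
    variable = discrete_gamma_rates(alpha, k)
    # Invariable sites get rate 0; variable sites are rescaled to keep the mean at 1.
    mix = [(0.0, p_inv)]
    mix += [(r / (1.0 - p_inv), (1.0 - p_inv) / k) for r in variable]
    return mix


if __name__ == "__main__":
    print(discrete_gamma_rates(0.5, 4))
    print(gamma_invariant_rates(0.5, 0.3, 4))
```

With a small alpha the lowest gamma category already has a rate close to zero, so it can absorb much the same signal as the invariable class. That overlap is one informal way to see why the two parameters are hard to estimate independently.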

Cited by 53 publications (37 citation statements)
References 50 publications
“…These values differ from those estimated for other species. The number of conserved sites in the human genome is ∼8% (48), and, in other viruses, the proportion is much higher, including 66% for HIV (49), 63% for influenza A virus (50), and 78% for rubella virus (51). These trends suggest that the number of invariant sites is inversely correlated with genome size.…”
Section: Discussion (mentioning)
confidence: 99%
“…The best‐fitting substitution model across all partitions for the nucleotide dataset was a general time‐reversible substitution model (GTR; Tavaré, ) with rate heterogeneity described by a gamma distribution discretized into four bins (+G; Yang, ) and a proportion of invariant sites (+I, Fitch & Margoliash, ). We did not use the GTR + I + G mixture model (Gu et al., ; Waddell & Steel, ) because this approach has been highly criticized on both empirical and theoretical grounds (Yang, , , ; Sullivan et al., ; Mayrose et al., ; Jia et al., ). Studies indicate that some of the parameters of the +I and +G models cannot be optimized independently of each other (Yang, , ; Jia et al., ); indeed, the estimated proportion of invariable sites was demonstrated to be highly susceptible to changes in the number of gamma rate categories of the +G model (Jia et al., ).…”
Section: Methods (mentioning)
confidence: 97%
“…We did not use the GTR + I + G mixture model (Gu et al., ; Waddell & Steel, ) because this approach has been highly criticized on both empirical and theoretical grounds (Yang, , , ; Sullivan et al., ; Mayrose et al., ; Jia et al., ). Studies indicate that some of the parameters of the +I and +G models cannot be optimized independently of each other (Yang, , ; Jia et al., ); indeed, the estimated proportion of invariable sites was demonstrated to be highly susceptible to changes in the number of gamma rate categories of the +G model (Jia et al., ). Furthermore, it has been suggested that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates, and little to no biological meaning, especially at the intraspecific level (Jia et al., ).…”
Section: Methods (mentioning)
confidence: 97%
“…This implies that the standard practice of model selection before phylogeny inference is unnecessary when a complex and parameter-rich model is applied to nucleotide data (Abadi et al, 2019). However, it is worth mentioning that combining a gamma distribution (+G; Yang, 1993) with a proportion of invariant sites (+I; Fitch & Margoliash, 1967) to account for among-site rate heterogeneity has been highly criticized, mainly because some of the parameters of the +I and +G models cannot be optimized independently of each other (Yang, 2006; Sullivan, 1999; Mayrose, 2005; Jia et al, 2014). Thus, it is more advisable to apply a GTR + G mixture model for each partition, as opposed to a GTR + I + G.…”
Section: Describing the Process Of Molecular Evolution: Model Selecti… (mentioning)
confidence: 99%