2018
DOI: 10.1093/molbev/msy127

Relative Evolutionary Rates in Proteins Are Largely Insensitive to the Substitution Model

Abstract: The relative evolutionary rates at individual sites in proteins are informative measures of conservation or adaptation. Often used as evolutionarily aware conservation scores, relative rates reveal key functional or strongly selected residues. Estimating rates in a phylogenetic context requires specifying a protein substitution model, which is typically a phenomenological model trained on a large empirical data set. A strong emphasis has traditionally been placed on selecting the “best-fit” model, with the imp…

citations
Cited by 19 publications
(7 citation statements)
references
References 55 publications
“…Previous studies [25–28, 31] and the analyses conducted here revealed little impact of using alternative models on the accuracy of tree topologies. While our results were demonstrated for phylogenetic reconstruction and ancestral sequence reconstruction, evidence for the robustness of inference to the model employed was also shown for the estimation of relative evolutionary rates across proteins alignment sites [41], and for the inference of the evolutionary relationships when quartets are concerned [32]. We speculate that our conclusion should also hold for other tasks such as finding orthologous sequences, detecting horizontal gene transfer events, and the detection of conserved regions.…”
Section: Discussion
Citation type: mentioning
confidence: 55%
“…While it may be possible to ameliorate the influence of MSA uncertainty on relative model selection, we must also ask: Do we need to mitigate this issue in the first place? For example, recent studies have shown that, for both nucleotide and amino-acid models, the model selection procedure itself may not be a critical step in phylogenetic reconstruction, since different models with extreme differences in relative fit may not actually result in systematically different results [2, 31, 33], although how the precise model used may influence branch length and/or divergence estimation remains an important question [1, 2]. As such, if distinct models may yield highly similar inferences, optimizing the model selection procedure itself has diminishing returns.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…One of the most common approaches used to identify a suitable model for phylogenetic inference is relative model selection, wherein a set of candidate models are ranked according to a given goodness-of-fit measurement, and the best-fitting model is then used in the phylogenetic reconstruction [34]. Although recent studies have suggested that relative model selection may not be a critical step in phylogenetic studies [2, 31, 33], it remains an enduring staple of most analysis pipelines. Henceforth, we use the phrase “model selection” to refer specifically to relative model selection, unless otherwise stated.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Multiple sequence alignments and their associated consensus trees were used as inputs and evaluated under a sixteen-category gamma-distributed model. To more directly measure the values of interest (i.e., the relative site-wise rates of amino acid residue replacement), and in consideration of recent developments in the field [16, 17], site rates were scored based on the equal-probability matrix proposed by Jukes and Cantor [18] rather than the default matrix proposed by Jones et al. [19]. We used the empirical Bayesian method of rate inference implemented in Rate4site, and site rates were normalized as z-scores with mean = 0.0 so that in all alignments, positive scores indicated faster sites while negative scores indicated slower sites.…”
Section: Methods
Citation type: mentioning
confidence: 99%
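
The z-score normalization mentioned in the Methods statement above can be illustrated with a short sketch. This is not the citing authors' code or part of Rate4site: the function name `zscore_site_rates` and the example rates are hypothetical, and the use of the population standard deviation is an assumption, since the statement only specifies mean = 0.0.

```python
from statistics import fmean, pstdev


def zscore_site_rates(rates):
    """Rescale per-site rates to z-scores (hypothetical helper).

    Assumption: the population standard deviation is used; the quoted
    statement only says the scores have mean 0.0.
    """
    if len(rates) < 2:
        raise ValueError("need at least two site rates")
    mu = fmean(rates)
    sigma = pstdev(rates, mu)
    if sigma == 0:
        raise ValueError("all site rates are identical; z-scores are undefined")
    # Sites above the alignment mean get positive scores (faster),
    # sites below it get negative scores (slower).
    return [(r - mu) / sigma for r in rates]


# Made-up rates for a four-site alignment, for illustration only.
example_rates = [0.2, 0.5, 1.0, 2.3]
scores = zscore_site_rates(example_rates)
```

Because every alignment is rescaled to mean 0, the sign of a score is comparable across alignments even when the raw rates are on different scales.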