2004
DOI: 10.1093/molbev/msh112

A Bayesian Mixture Model for Across-Site Heterogeneities in the Amino-Acid Replacement Process

Abstract: Most current models of sequence evolution assume that all sites of a protein evolve under the same substitution process, characterized by a 20 x 20 substitution matrix. Here, we propose to relax this assumption by developing a Bayesian mixture model that allows the amino-acid replacement pattern at different sites of a protein alignment to be described by distinct substitution processes. Our model, named CAT, assumes the existence of distinct processes (or classes) differing by their equilibrium frequencies ov…

Cited by 1,350 publications (1,241 citation statements)
References 48 publications
“…Using 81 curated sequences, multiple sequence alignments were inferred using MSAProbs (Liu, Schmidt, & Maskell, 2010) and MUSCLE (Edgar, 2004) with the default settings. Both of these alignments were best‐fit by the PROTCATWAG model (Lartillot & Philippe, 2004; Le & Gascuel, 2008), with model fitness assessed using the Akaike information criterion (Abascal, Zardoya, & Posada, 2005). The 81 protein sequences used in this study are available to download from the following URL: http://www.phylobot.com/582058404/RuBisCO.noalign.fasta …”
Section: Methods (mentioning)
confidence: 99%
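The model comparison described in this statement relies on the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood; the candidate with the lowest AIC is preferred. A minimal sketch follows. The model names, log-likelihoods, and parameter counts are hypothetical placeholders, not values from the cited study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: name -> (maximized log-likelihood, free parameters).
# These numbers are illustrative only and do not come from the cited analysis.
candidates = {
    "model_A": (-12345.6, 190),
    "model_B": (-12290.1, 210),
}

# Score every candidate and pick the one with the smallest AIC.
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("preferred:", best)
```

The penalty term 2k is what keeps a richer model from winning on likelihood alone: here the hypothetical model_B carries 20 more parameters, and is preferred only because its log-likelihood gain outweighs that cost.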
“…Importantly, the strong statistical support for the monophyly of Olfactores was unaffected by taxon addition (Bourlat et al, 2006). Models accounting for site-specific modulations of the amino-acid replacement process, such as the CAT mixture model (Lartillot and Philippe, 2004), seem to offer a significantly better fit to real data than empirical substitution matrices currently used in standard models of amino-acid sequence evolution. Accounting for site-specific amino-acid propensities has also been shown to induce a significant improvement of phylogenetic reconstruction in difficult cases such as long-branch attraction (Baurain et al, 2007; Lartillot et al, 2007; Lartillot and Philippe, 2008).…”
Section: Effect Of An Improved Model Of Sequence Evolution (mentioning)
confidence: 99%
“…Bayesian phylogenetic analyses of the two phylogenomic datasets were conducted using the program PHYLOBAYES 2.3c (http://www.atgc-montpellier.fr/phylobayes/) under the CAT+Γ4 site-heterogeneous mixture model (Lartillot and Philippe, 2004). For each dataset, four independent Monte Carlo Markov Chains (MCMCs) starting from a random topology were run in parallel for 20,000 cycles (1,500,000 generations), saving a point every cycle, and discarding the first 2,000 points as the burnin.…”
Section: Phylogenomic Analyses (mentioning)
confidence: 99%
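The sampling scheme in this statement implies some simple bookkeeping, sketched below: how many saved points survive the burn-in in each chain, and how many generations make up one cycle. The function name is hypothetical, and pooling the four chains is an assumption made only for illustration; the statement does not say how the chains were combined.

```python
def retained_points(total_cycles: int, save_every: int, burnin_points: int) -> int:
    """Number of saved MCMC points left after discarding the burn-in."""
    saved = total_cycles // save_every
    if burnin_points > saved:
        raise ValueError("burn-in exceeds the number of saved points")
    return saved - burnin_points


# Figures quoted in the statement above.
total_cycles = 20_000
total_generations = 1_500_000
burnin_points = 2_000
n_chains = 4

per_chain = retained_points(total_cycles, save_every=1, burnin_points=burnin_points)
generations_per_cycle = total_generations // total_cycles

# Assumption for illustration only: post-burn-in points from all chains are pooled.
pooled = per_chain * n_chains

print("retained per chain:", per_chain)
print("generations per cycle:", generations_per_cycle)
print("pooled across chains:", pooled)
```

Discarding 2,000 of 20,000 saved points corresponds to a 10% burn-in per chain.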
“…For concatenated data, we explored models with and without proportional branch lengths across subsets suggested by PartitionFinder. Under the CAT model (Lartillot & Philippe 2004), substitution rates are constant across sites and trees, whereas state frequencies are treated as a Dirichlet process with an infinite number of mixtures across sites, unobserved states at each site being united into a single state (Lartillot et al 2007). We used default priors, except that the prior on branch lengths was set to an exponential with a mean seeded by an exponential hyperprior with mean 0.1.…”
Section: Phylogenetic Analyses (mentioning)
confidence: 99%