2013
DOI: 10.1093/sysbio/syt057

Poor Fit to the Multispecies Coalescent is Widely Detectable in Empirical Data

Abstract: Model checking is a critical part of Bayesian data analysis, yet it remains largely unused in systematic studies. Although phylogeny estimation has recently moved into an era of increasingly complex models that simultaneously account for multiple evolutionary processes, the statistical fit of these models to the data has rarely been tested. Here we develop a posterior predictive simulation-based model check for a commonly used multispecies coalescent model, implemented in *BEAST, and apply it to 25 published data sets.…

Cited by 79 publications (102 citation statements)
References 77 publications

“…The coalescent inference is based on the assumption that the multispecies coalescent model is a good approximation to the real biological process that causes incongruent gene trees. There are preliminary attempts to assess the goodness of fit of the multispecies coalescent model, but they were either limited to small data sets (Reid et al, 2014) or they were based on the distance between gene trees and the species tree (Song et al, 2012), which may not have the power to reject the multispecies coalescent model. Reid et al (2014) evaluated the multispecies coalescent model in a Bayesian framework using posterior predictive simulation (PPS), in which the estimated gene trees were compared with the predictive distribution of gene trees.…”
Section: Discussion (mentioning)
confidence: 99%
“…There are preliminary attempts to assess the goodness of fit of the multispecies coalescent model, but they were either limited to small data sets (Reid et al, 2014) or they were based on the distance between gene trees and the species tree (Song et al, 2012), which may not have the power to reject the multispecies coalescent model. Reid et al (2014) evaluated the multispecies coalescent model in a Bayesian framework using posterior predictive simulation (PPS), in which the estimated gene trees were compared with the predictive distribution of gene trees. Since the predictive distribution of gene trees is generated from a Bayesian coalescent approach (i.e., *BEAST), PPS is not able to evaluate the multispecies coalescent model for genome-scale sequence data.…”
Section: Discussion (mentioning)
confidence: 99%
“…Indeed, simulation studies have shown that combining data can provide high statistical support for an incorrect species tree because of processes potentially leading to incongruence among data partitions, including incomplete lineage sorting, gene duplications/extinctions (paralogy), non-neutral evolution and hybridization (see ref. 42 for review), questioning the use of a total evidence approach [43][44][45]. Incongruence was operationally defined as the presence of incompatible bipartitions found in branches supported with bootstrap proportions >95% from separate analyses of two data sets.…”
Section: Methods (mentioning)
confidence: 99%
“…Considering that our focal group is intensively studied and exceptionally well represented in research, museum, and private collections due to its aesthetic appeal, it would be considerably more challenging to obtain a complete sampling of many other groups. Another difficulty stems from the fact that the advanced coalescent techniques like BUCKy and *BEAST perform best with multiple samples per species, which should capture intraspecific diversity, and may require complete taxon coverage (Heled and Drummond 2010; Reid et al 2013; Steel and Velasco 2014). Much of the uncertainty in the estimates can be attributed to missing data, which can negatively affect the estimation of both individual gene trees and the encompassing species tree (Wiens and Morrill 2011; Roure et al 2013).…”
Section: Optimal Sampling (mentioning)
confidence: 99%