Biocomputing 2001 2000
DOI: 10.1142/9789814447362_0053

Scaling of Accuracy in Extremely Large Phylogenetic Trees

Abstract: The accuracy of phylogenetic inference was examined in simulated data sets up to nearly 10,000 taxa, the size of the largest set of homologous genes in existing molecular sequence databases. Even with a simple search algorithm (maximum parsimony without branch swapping), the number of characters needed to estimate 80% of a tree correctly can scale remarkably well at optimal substitution rates (on the order of log N, where N is the number of taxa). In other words, the number of taxa in an analysis can be double…

Cited by 43 publications (52 citation statements)
References: 20 publications
“…Thus, the analysis of larger data sets requires a disproportionately longer time (or disproportionately more computer resources) and/or the use of increasingly less efficient heuristic search strategies, with both factors impacting negatively on our ability to recover the best solution for that given data set. Fortunately, several studies using empirical and/or simulated data have shown that even phylogenetic analyses at the high end of the scale currently examined are both tractable and show acceptable, if not surprising, accuracy with shorter sequence lengths than might be expected [11][12][13], thereby reinforcing some theoretical work in the latter area [14,15]. Additionally, advances in computer technology and architecture such as parallel and distributed computing and programs that exploit them efficiently in combination with the continual development of faster search strategies promise to make even larger phylogenetic problems increasingly tractable.…”
Section: Introduction (mentioning)
confidence: 92%
“…The simulation protocol used was modelled on that followed by Bininda-Emonds et al 13 to examine the scaling of accuracy in very large phylogenetic trees. For each run, a model tree of 4,096 taxa was generated according to a stochastic Yule birth process using the default parameters of the YULE_C procedure in the program r8s v1.60 25 .…”
Section: Simulation Protocol (mentioning)
confidence: 99%
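
The citation statement above describes generating a model tree under a stochastic Yule birth process. As a rough illustration of what such a process does, here is a minimal sketch in Python. It is not the YULE_C procedure of r8s and does not reproduce its default parameters; the function name `simulate_yule_tree`, the `birth_rate` value, and the choice to stop the tree one waiting time after the last split are all assumptions made for this example.

```python
import random


def simulate_yule_tree(n_taxa, birth_rate=1.0, seed=None):
    """Grow a pure-birth (Yule) tree until it has n_taxa tips.

    Hypothetical helper for illustration only. Returns (parent,
    branch_length, tips): parent maps node id -> parent id (None for the
    root, node 0), branch_length maps node id -> length of the branch
    leading to that node, and tips lists the ids of the extant lineages.
    """
    rng = random.Random(seed)
    parent = {0: None}
    birth_time = {0: 0.0}
    branch_length = {}
    tips = [0]
    now = 0.0
    next_id = 1

    while len(tips) < n_taxa:
        # With k lineages alive, the waiting time to the next speciation
        # is exponential with rate k * birth_rate.
        now += rng.expovariate(birth_rate * len(tips))
        # Every extant lineage is equally likely to be the one that splits.
        idx = rng.randrange(len(tips))
        node = tips[idx]
        branch_length[node] = now - birth_time[node]
        left, right = next_id, next_id + 1
        next_id += 2
        for child in (left, right):
            parent[child] = node
            birth_time[child] = now
        tips[idx] = left
        tips.append(right)

    # Assumed convention: let the tips run for one more waiting time so the
    # two youngest tips do not end up with zero-length branches.
    now += rng.expovariate(birth_rate * len(tips))
    for tip in tips:
        branch_length[tip] = now - birth_time[tip]

    return parent, branch_length, tips


if __name__ == "__main__":
    # 4,096 taxa, the tree size mentioned in the citing paper.
    parent, branch_length, tips = simulate_yule_tree(4096, seed=0)
    print(len(tips), "tips,", len(parent), "nodes in total")
```

Because every lineage shares the same clock, the resulting tree is ultrametric: all tips sit at the same distance from the root. Sequence evolution along the branches, which the simulation protocol would need next, is not covered by this sketch.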