2021
DOI: 10.1093/bioinformatics/btab428

wQFM: highly accurate genome-scale species tree estimation from weighted quartets

Abstract: Motivation: Species tree estimation from genes sampled throughout the whole genome is complicated by gene tree-species tree discordance. Incomplete lineage sorting (ILS) is one of the most frequent causes of this discordance, where alleles can coexist in populations for periods that may span several speciation events. Quartet-based summary methods for estimating species trees from a collection of gene trees are becoming popular due to their high accuracy and statistical guarantee…

Cited by 19 publications (42 citation statements)
References: 90 publications
“… Allman et al (2011) (here given as ADR) provided one of the fundamental theorems underlying species tree estimation under the MSC: they proved that for any four species, the unrooted topology of the species tree is the same as the topology of the most probable unrooted gene tree. This theorem has been used to establish statistical consistency for quartet-based methods, such as ASTRAL, wQFM (Mahbub et al, 2021) and the population tree in BUCKy (Larget et al, 2010). Interestingly, Theorem 9 from ADR has received much less attention and (to the best of our knowledge) is not used in any species tree estimation method.…”
Section: Introduction (mentioning)
confidence: 99%
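
The four-species result quoted above can be illustrated numerically. The following is a minimal sketch, not code from any of the cited tools: it assumes the standard multispecies-coalescent result that, for an unrooted four-taxon species tree with internal branch length `t` in coalescent units, the matching gene tree topology has probability 1 - (2/3)e^(-t) and each of the two alternatives has probability (1/3)e^(-t). The function names and the `"ab|cd"` string labels are hypothetical.

```python
import math
from collections import Counter


def msc_quartet_probs(t):
    """Unrooted quartet gene tree probabilities under the MSC.

    `t` is the internal branch length of the four-taxon species tree in
    coalescent units. "match" is the topology that agrees with the species
    tree; "alt1" and "alt2" are the two discordant topologies.
    """
    minor = math.exp(-t) / 3.0
    return {"match": 1.0 - 2.0 * minor, "alt1": minor, "alt2": minor}


def dominant_quartet(topologies):
    """Return the most frequent quartet topology among observed gene trees."""
    return Counter(topologies).most_common(1)[0][0]


# Hypothetical observations for taxa a, b, c, d across three gene trees.
observed = ["ab|cd", "ab|cd", "ac|bd"]
estimate = dominant_quartet(observed)
```

For any `t > 0` the matching topology is strictly the most probable, which is why picking the dominant quartet is a consistent estimator as the number of gene trees grows; at `t = 0` all three topologies are equally likely.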
“…Importantly, ASTRAL operates directly on the input set of k gene trees instead of explicitly constructing a set of Ω(n^4) weighted quartets, where n is the number of species (also referred to as taxa). This is in stark contrast to other popular heuristics for MQSST, namely Weighted Quartet Max Cut (wQMC; Avni et al, 2014) and Weighted Quartet Fiduccia-Mattheyses (wQFM; Mahbub et al, 2021), which take a set of weighted quartets as input and construct the species tree in a divide-and-conquer fashion, for example via graph cuts. When considering the time to weight quartets based on the gene trees, both wQFM and wQMC can be far more computationally intensive than ASTRAL (Mahbub et al, 2021).…”
Section: Introduction (mentioning)
confidence: 92%
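
To make the cost of explicit quartet weighting concrete, here is a naive sketch of how weighted quartets could be tallied from gene trees. It is an illustration under stated assumptions, not the preprocessing used by wQFM or wQMC: each gene tree is represented by a hypothetical list of bipartition sides (`frozenset`s of leaf names), every tree is assumed to contain all taxa, and the weight of a quartet topology is taken to be the number of gene trees that induce it.

```python
from collections import Counter
from itertools import combinations
from math import comb


def induced_quartet(splits, quad):
    """Return the topology a tree induces on four taxa, or None if unresolved.

    `splits` holds one side (a frozenset of leaf names) of each nontrivial
    bipartition of an unrooted gene tree; all four taxa must be in the tree.
    """
    a, b, c, d = quad
    for left, right in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
        for side in splits:
            inside = {t for t in quad if t in side}
            if inside == set(left) or inside == set(right):
                return (left, right)
    return None


def weighted_quartets(gene_trees, taxa):
    """Count, for every 4-subset of taxa, how many gene trees induce each topology."""
    weights = Counter()
    for quad in combinations(sorted(taxa), 4):  # C(n, 4) subsets
        for splits in gene_trees:
            topology = induced_quartet(splits, quad)
            if topology is not None:
                weights[topology] += 1
    return weights


# Hypothetical input: two copies of ((A,B),C,(D,E)) and one ((A,C),B,(D,E)).
taxa = ["A", "B", "C", "D", "E"]
tree_ab = [frozenset("AB"), frozenset("DE")]
tree_ac = [frozenset("AC"), frozenset("DE")]
weights = weighted_quartets([tree_ab, tree_ab, tree_ac], taxa)
n_quartets = comb(len(taxa), 4)
```

The outer loop visits all C(n, 4) four-taxon subsets for each of the k gene trees, so the number of quartets grows on the order of n^4, which is the overhead the quoted passage contrasts with methods that work on the gene trees directly.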
“…This is in stark contrast to other popular heuristics for MQSST, namely Weighted Quartet Max Cut (wQMC; Avni et al, 2014) and Weighted Quartet Fiduccia-Mattheyses (wQFM; Mahbub et al, 2021), which take a set of weighted quartets as input and construct the species tree in a divide-and-conquer fashion, for example via graph cuts. When considering the time to weight quartets based on the gene trees, both wQFM and wQMC can be far more computationally intensive than ASTRAL (Mahbub et al, 2021). Additionally, wQMC was substantially less accurate than either wQFM or ASTRAL in a recent simulation study (Mahbub et al, 2021).…”
Section: Introduction (mentioning)
confidence: 92%
“…In recent years, many accurate summary methods statistically consistent under the MSC have been developed, such as MP-EST [20], NJst [37], ASTRAL [26], ASTRID [46], FASTRAL [8], wQFM [23]. Many of these methods are scalable to genomic-scale data, and under sufficient gene signal and ILS tend to be more accurate than concatenation [31].…”
Section: Introduction (mentioning)
confidence: 99%