2020
DOI: 10.1101/2020.04.20.049999
Preprint

Disentangling biological and analytical factors that give rise to outlier genes in phylogenomic matrices

Abstract: The genomic data revolution has enabled biologists to develop innovative ways to infer key episodes in the history of life. Whether genome-scale data will eventually resolve all branches of the Tree of Life remains uncertain. However, through novel means of interrogating data, some explanations for why evolutionary relationships remain recalcitrant are emerging. Here, we provide four biological and analytical factors that explain why certain genes may exhibit "outlier" behavior, namely, rate of molecular evolu…

Citations: cited by 5 publications (7 citation statements)
References: 49 publications (78 reference statements)
“…Where Xenacoelomorpha place in the animal tree of life has profound implications for our understanding of animal evolution. The patterns in the data previously applied to this problem indicate low levels of signal (Figure 2), suggesting a small proportion of genes with strong signal which are contributing to the placement of Xenacoelomorpha (Shen et al 2017;Brown and Thomson 2017;Di Franco et al 2019;Walker et al 2020). The issue of inadequate or misleading signal within phylogenomic studies is often overlooked in favour of appropriate model selection or the size of data matrices (Philippe, Brinkmann, Lavrov, et al 2011).…”
Section: Discussion (mentioning)
confidence: 99%
“…Most of these large scale studies set out to reduce paralogy, yet a standardised assessment of the prevailing levels of hidden paralogy is lacking. Hidden paralogy (Doolittle & Brown 1994) is driven by gene duplication followed by subsequent differential loss, and has been shown to have profound effects on species tree inference (Supplementary Figure S1A) (Brown & Thomson 2017; Siu-Ting et al 2019; Walker et al 2020; Natsidis et al 2021). Given the observed high rates of genome duplication and gene turnover in animals (Grau-Bové et al 2017; Richter et al 2018; Paps & Holland 2018; Fernández & Gabaldón 2020; Guijarro-Clarke et al 2020), combined with the use of incomplete transcriptomic and genomic data in phylogenomic analyses, hidden paralogy and misleading phylogenetic signal remains a significant concern for species tree inference.…”
Section: Introduction (mentioning)
confidence: 99%
“…For concatenated alignments we exported per site log-likelihoods and summed them for each included locus according to its coordinates in the alignment (Lee and Hugall 2003; Shen et al 2017). For both types of likelihood signal assessment we scaled locus log-likelihood by locus length to estimate a mean per site log-likelihood and to avoid bias associated with locus length (Walker et al 2020).…”
Section: Methods (mentioning)
confidence: 99%
“…The approach of fitting loci individually was used to minimize the probability of a single underlying tree assumption violation, which is very likely in the concatenation framework (Rannala et al 2020). We scaled locus log-likelihood by locus length to estimate a mean per site log-likelihood and to avoid bias associated with locus length (Walker et al 2020). Because the loci were on average very short, we hypothesized that the estimates of model parameters may be too inaccurate compared to model estimates for longer alignments (Xia 2020).…”
Section: Phylogenetic Signal Assessment (mentioning)
confidence: 99%
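
The two statements above describe summing per-site log-likelihoods within each locus of a concatenated alignment and then dividing by locus length. A minimal Python sketch of that arithmetic follows. It is not the cited authors' code: the function name `locus_log_likelihoods`, the 0-based half-open coordinates, and the toy values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def locus_log_likelihoods(
    site_lnl: List[float],
    partitions: Dict[str, Tuple[int, int]],
) -> Dict[str, Tuple[float, float]]:
    """Return {locus: (summed lnL, mean per-site lnL)}.

    site_lnl   -- per-site log-likelihoods for the concatenated alignment
    partitions -- locus name -> (start, end), 0-based and half-open
                  (an assumed convention; partition files often use
                  1-based inclusive coordinates instead)
    """
    scores = {}
    for name, (start, end) in partitions.items():
        if not 0 <= start < end <= len(site_lnl):
            raise ValueError(f"bad coordinates for {name}: {start}-{end}")
        total = sum(site_lnl[start:end])   # locus log-likelihood
        length = end - start               # locus length in sites
        scores[name] = (total, total / length)
    return scores


# Toy, made-up values: two loci of different length.
site_lnl = [-2.0, -4.0, -1.0, -3.0, -5.0, -3.0]
partitions = {"locusA": (0, 2), "locusB": (2, 6)}
scores = locus_log_likelihoods(site_lnl, partitions)
for name, (total, mean) in scores.items():
    print(name, total, mean)
```

In this toy case the longer locus has a summed log-likelihood twice as negative as the shorter one, yet both have the same mean per-site value, which is the length bias the scaling is meant to remove.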