2021
DOI: 10.1101/2021.01.22.427808
Preprint

DEPP: Deep Learning Enables Extending Species Trees using Single Genes

Abstract: Identifying samples in an evolutionary context is a fundamental step in the study of microbiomes and, more broadly, biodiversity. Extending a reference phylogeny by placing new query sequences onto it has been increasingly used for sample identification and other applications. Existing phylogenetic placement methods have assumed that the query sequence is homologous to the data used to infer the reference phylogeny. Thus, they are designed to place data from a single gene onto a gene tree (e.g., they can place …

Cited by 20 publications (13 citation statements)
References 88 publications (143 reference statements)
“…In this framework, the input to APPLES is a reference (a.k.a. backbone) phylogenetic tree $T$ with $n$ leaves and a vector of distances $\delta_{qi}$ between a query taxon $q$ and every taxon $i$ on $T$. Although machine-learning based methods shows substantial promise (Jiang et al, 2021), typically, input distances are obtained by calculating sequence distances between query and backbone taxa followed by a phylogenetic correction using a statistically consistent method under a model such as Jukes-Cantor (JC69) (Jukes and Cantor, 1969). APPLES introduced a dynamic programming algorithm to find a placement of $q$ that minimizes weighted least squares error $\sum_{i=1}^{n} w_{qi} (\delta_{qi} - d_{qi}(T))^2$ where $d_{qi}(T)$ represents the path distance from $q$ to backbone taxon $i$ on $T$.…”
Section: Methods
citation type: mentioning
confidence: 99%
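The objective quoted above can be illustrated with a minimal Python sketch. It is not the APPLES implementation: it only evaluates the weighted least squares error for one candidate placement, and it does not perform the dynamic programming search over the tree. The function names, the dictionary-based inputs, and the weighting scheme $w_{qi} = \delta_{qi}^{-k}$ with $k = 2$ are assumptions made for illustration; the JC69 correction uses the standard formula $d = -\tfrac{3}{4}\ln(1 - \tfrac{4}{3}p)$, where $p$ is the proportion of mismatched sites.

```python
import math


def hamming_proportion(a, b):
    """Proportion of mismatched sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jc69_distance(p):
    """JC69-corrected distance from the mismatch proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 correction is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def weighted_ls_error(delta, d, k=2.0):
    """Sum over backbone taxa i of w_qi * (delta_qi - d_qi(T))^2.

    delta: hypothetical dict mapping backbone taxon -> input distance delta_qi
    d:     hypothetical dict mapping backbone taxon -> path distance d_qi(T)
           for one candidate placement of the query
    k:     exponent of the assumed weighting w_qi = delta_qi ** -k;
           every delta_qi must be strictly positive for k > 0
    """
    total = 0.0
    for taxon, delta_qi in delta.items():
        w_qi = delta_qi ** -k
        total += w_qi * (delta_qi - d[taxon]) ** 2
    return total


# Usage: score two hypothetical candidate placements and keep the better one.
query = "ACGTACGTAC"
backbone = {"A": "ACGTACGTAA", "B": "ACGAACGGAA"}
delta = {name: jc69_distance(hamming_proportion(query, seq))
         for name, seq in backbone.items()}
candidates = {
    "edge_1": {"A": 0.10, "B": 0.40},
    "edge_2": {"A": 0.30, "B": 0.30},
}
best = min(candidates, key=lambda e: weighted_ls_error(delta, candidates[e]))
print(best)
```

Weighting by an inverse power of the input distance down-weights long distances, which tend to be estimated with higher variance; setting `k=0` reduces the function to ordinary least squares.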
“…A main attraction of placing new sequences onto an existing phylogeny is computational expediency: the running time of phylogenetic placement is a fraction of the time needed for de novo reconstruction and can grow linearly with the number query samples assuming they are placed independently. To take advantage of this potential, many methods have been developed using a wide range of algorithmic techniques (e.g., Barbera et al, 2019; Brown and Truszkowski, 2013; Jiang et al, 2021; Linard et al, 2019; Matsen et al, 2010; Mirarab et al, 2011; Rabiee and Mirarab, 2018; Stark et al, 2010; Zheng et al, 2018).…”
Section: Introduction
citation type: mentioning
confidence: 99%
“…Given an existing species tree T , INSTRAL will add the new species into the existing tree to optimize the quartet tree support for the extended species tree (i.e., INSTRAL extends the theoretical approach in ASTRAL). Another new method is DEPP (Jiang et al, 2021), which computes distances using a deep neural network (DNN) and then runs APPLES to place the new species into the tree. By training the DNN appropriately, these distances can be appropriate to this problem of adding species into species trees.…”
Section: Adding Species To Species Trees
citation type: mentioning
confidence: 99%
“…To take advantage of this potential, many methods have been developed using a wide range of algorithmic techniques (e.g. Balaban & Mirarab, 2020; Barbera et al, 2019; Brown & Truszkowski, 2013; Jiang et al, 2021; Linard et al, 2019; Matsen et al, 2010; Mirarab et al, 2011; Rabiee & Mirarab, 2018; Stark et al, 2010; Zheng et al, 2018).…”
Section: Introduction
citation type: mentioning
confidence: 99%
“…a backbone) phylogenetic tree $T$ with $n$ leaves and a vector of distances $\delta_{qi}$ between a query taxon $q$ and every taxon $i$ on $T$. Although machine learning‐based methods show substantial promise (Jiang et al, 2021), typically, input distances are obtained by calculating sequence distances between query and backbone taxa followed by a phylogenetic correction using a statistically consistent method under a model such as Jukes–Cantor (JC69) (Jukes & Cantor, 1969). APPLES introduced a dynamic programming algorithm to find a placement of $q$ that minimizes weighted least squares error $\sum_{i=1}^{n} w_{qi} (\delta_{qi} - d_{qi}(T))^2$ where $d_{qi}(T)$ represents the path distance from $q$ to backbone taxon $i$ on $T$.…”
Section: Introduction
citation type: mentioning
confidence: 99%