2021
DOI: 10.1101/2021.12.22.473813
Preprint

ModelRevelator: Fast phylogenetic model estimation via deep learning

Abstract: Selecting the best model of sequence evolution for a multiple sequence alignment (MSA) constitutes the first step of phylogenetic tree reconstruction. Common approaches for inferring nucleotide models typically apply maximum likelihood (ML) methods, with discrimination between models determined by one of several information criteria. This requires tree reconstruction and optimisation which can be computationally expensive. We demonstrate that neural networks can be used to perform model selection, without the …

Cited by 4 publications (6 citation statements)
References 49 publications (57 reference statements)
“…Burgstaller-Muehlbacher et al (2021) analysed images of alignments of 8, 16, 64, 128, 256 and 1024 taxa with CNNs to determine the best model of sequence evolution and to estimate the shape parameter of the gamma distribution. They compared the model prediction accuracies with those of ModelFinder with BIC and found comparable accuracies between the methods.…”
Section: Discussion and Comparison Of Results (mentioning)
confidence: 99%
“…Another major consideration is how to encode input data for neural networks. Most commonly, encoded alignments (Suvorov & Schrider, 2022; Suvorov et al, 2020; Zou et al, 2020), or summary statistics (Abadi et al, 2020; Burgstaller-Muehlbacher et al, 2023) have been used as input. When using encoded alignments, a primary challenge is scalability to longer alignments or more taxa.…”
Section: Discussion (mentioning)
confidence: 99%
“…A later model, ModelRevelator (Burgstaller-Muehlbacher et al, 2023) aims to infer the correct generating model of nucleotide substitution using two neural networks. The first network, NNmodelfinder, takes as input a set of statistics calculated from pairwise alignments and predicts the best substitution model from a set of six possible models.…”
Section: Substitution Models (mentioning)
confidence: 99%
“…However, many applications require simulations of a vast number of large MSAs. For example, training new machine learning applications for phylogenetics (Suvorov and Schrider, 2022; Burgstaller-Muehlbacher et al., 2021; Abadi et al., 2020; Suvorov et al., 2020) requires millions of simulated MSAs. Due to its sequential implementation, AliSim becomes very slow, taking several days or weeks to simulate millions of alignments.…”
Section: Introduction (mentioning)
confidence: 99%