2019
DOI: 10.1038/s41587-019-0333-6

Large multiple sequence alignments with a root-to-leaf regressive method

Abstract: Multiple sequence alignments (MSAs) are used for structural1,2 and evolutionary predictions1,2, but the complexity of aligning large datasets requires the use of approximate solutions3, including the progressive algorithm4. Progressive MSA methods start by aligning the most similar sequences and subsequently incorporate the remaining sequences, from leaf-to-root, based on a guide-tree. Their accuracy declines substantially as the number of sequences is scaled up5. We introduce a regressive algorithm that enabl…
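
The leaf-to-root order that the abstract attributes to progressive methods can be illustrated with a small sketch. It only records which groups of sequences would be merged, and in what order, during a post-order walk of a guide tree; it performs no alignment, and it does not attempt the regressive algorithm, whose description is truncated above. The toy tree, the sequence names, and the function name `progressive_merge_order` are assumptions made for illustration and are not taken from the paper or from any alignment tool.

```python
from typing import List, Tuple, Union

# Hypothetical toy guide tree: nested pairs whose leaves are sequence names.
Tree = Union[str, Tuple["Tree", "Tree"]]
Merge = Tuple[List[str], List[str]]


def progressive_merge_order(tree: Tree, merges: List[Merge]) -> List[str]:
    """Walk the tree in post-order and record each merge.

    Returns the leaves under `tree`. Every internal node appends one
    (left_leaves, right_leaves) pair to `merges`, so the deepest pairs
    are recorded first and the root is recorded last.
    """
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    left_leaves = progressive_merge_order(left, merges)
    right_leaves = progressive_merge_order(right, merges)
    merges.append((left_leaves, right_leaves))
    return left_leaves + right_leaves


guide_tree: Tree = ((("A", "B"), "C"), ("D", "E"))
merges: List[Merge] = []
all_leaves = progressive_merge_order(guide_tree, merges)

for left_group, right_group in merges:
    print(left_group, "+", right_group)
```

In this sketch the closest pair at the bottom of the tree is merged first and the two largest groups are merged last at the root, which is the leaf-to-root direction the abstract describes.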

Cited by 27 publications (33 citation statements)
References 24 publications
“…(A) The edge leading to hummingbirds exhibits the largest number of changes to mitochondria-encoded proteins when considering all internal edges of a bird phylogenetic tree. This maximum likelihood tree was generated from an alignment of concatenated mitochondrial proteins from birds and Bos taurus using T-coffee in regressive mode (Garriga et al 2019), followed by ancestral prediction using PAGAN (Löytynoja et al 2012). Amino acid substitutions between each pair of ancestral and descendant nodes internal to the bird tree (node-to-node) were determined, summed across all positions, and plotted.…”
Section: Results (mentioning)
confidence: 99%
“…Alignments were performed by use of standalone MAFFT (version 7.407) (Katoh and Standley 2013) or by T-coffee (version 13.40.5) in regressive mode (Garriga et al 2019). For initial alignments of insect COI barcodes, MAFFT alignment was performed using an online server (Kuraku et al 2013; Katoh 2017), and translations of barcodes using the appropriate codon tables were performed using AliView (Larsson 2014).…”
Section: Methods (mentioning)
confidence: 99%
“…VCF file) describing their relationship. Algorithms for calculating genome-scale multiple alignments are resource intensive 34,35 and yield a more complex structure compared to a pairwise alignment. Reference flow's use of pairwise alignments also helps to solve an "N+1" problem; adding one additional reference to the second pass requires only that we index the new genome and obtain an additional whole-genome alignment (or otherwise infer such an alignment, e.g.…”
Section: Discussion (mentioning)
confidence: 99%
“…High throughput sequencing technologies resulted in an unprecedented surge of genomic data. This data explosion challenges many existing analysis pipelines that rely on global multiple sequence alignments (MSA) (Nishimura et al, 2016; Garriga et al, 2019). Although alignment-free methods sequence comparisons exist (Ren et al, 2018), plenty of software still require a global alignment of all sequences (i.e.…”
Section: Introduction (mentioning)
confidence: 99%
“…In fact, it has been shown that alignment quality decays with an increasing number of sequences (Sievers et al, 2011). Newer software such as PASTA (Mirarab et al, 2015) and the regressive alignment algorithm (RAA) (Garriga et al, 2019) leverage traditional MSA software capabilities to create alignments for hundreds of thousands to millions of sequences. However, these strategies also suffer from the same weaknesses (Figure S2).…”
Section: Introduction (mentioning)
confidence: 99%