2022
DOI: 10.1038/s41587-021-01111-2

Deep distributed computing to reconstruct extremely large lineage trees

citations
Cited by 19 publications
(11 citation statements)
references
References 50 publications
“…Recently, methods were developed that can incorporate a priori information on the frequency of editing outcomes [10–12], which helps reduce biased inference due to homoplasy or the exclusive use of pairwise distances. While some approaches focus on improving scalability to reconstruct trees with millions of cells [10, 13], others focus on detailed modelling of the editing process to enable more accurate inference [12]. A key example of the latter is the maximum-likelihood framework GAPML [12], which models the editing process of a GESTALT recorder [2].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, such inferred phylogenetic trees are inherently error-prone for two reasons. First, exhaustive phylogenetic tree search to guarantee optimality is not computationally practical for hundreds of terminal cells, and practical heuristic algorithms do not guarantee optimality [26] despite the recent advances in distributed computing [27]. Second, barcoding strategies employ a limited number of mutation sites which encode a limited amount of information; how close any inferred tree—including the optimal tree—is to its true phylogeny remains uncertain.…”
Section: Results (mentioning)
confidence: 99%
“…This includes genomic datasets with many samples from an individual pathogen, including for example large collections of M. tuberculosis genomes 43 or influenza genomes 44, and collections of genomic data from possible future pandemics. Our approach could also be combined with divide-and-conquer phylogenetic algorithms 45,46 to further improve its performance and applicability.…”
Section: Discussion (mentioning)
confidence: 99%