Breaking Good: Accounting for Fragility of Genomic Regions in Rearrangement Distance Estimation

Biller, Priscila; Guéguen, Laurent; Knibbe, Carole; Tannier, Éric

doi:10.1093/gbe/evw083

Cited by 43 publications

(39 citation statements)

References 52 publications


“…Simulations with Zombi are fast: with a starting genome of 500 genes and a species tree of 2000 taxa (extinct + extant), it takes around 1 minute on a 3.4 GHz laptop to simulate all the genomes (Figure S6). We validated that the distribution of waiting times between successive events was following an exponential distribution (Figures S7 and S8), that the distribution of intergene sizes at equilibrium was following a flat Dirichlet distribution, as expected from Biller et al. 2016 (Figure S9), that the number of events and their extension occurs with a frequency according to their respective rates (Figure S10) and that the gene family size distribution followed a power-law when duplication rates are higher than loss rates and stretched-exponential in the opposite case (Figure S11). We also checked by hand the validity of many simple scenarios to detect possible inconsistencies in the algorithm.…”

Section: Performance and Validation (supporting)

confidence: 72%

“…For example, it is possible to use a species tree input by the user, to generate species trees with variable extinction and speciation rates, or to control the number of living lineages at each unit of time (Figure S5). At the genome level, Zombi can simulate genomes using branch-specific rates (Gu mode, allowing the user to simulate very specific scenarios such as one in which a certain lineage experiences a massive loss of genes), gene-family specific rates (Gm mode, which makes easier the process of using rates estimated from real datasets) and genomes accounting for intergenic regions (Gf mode) of variable length (drawn from a flat Dirichlet distribution; Biller et al. 2016). At the sequence level, finally, the user can fine-tune the substitution rates to make them branch specific.…”

Section: Advanced Features (mentioning)

confidence: 99%
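
As an illustration of the "flat Dirichlet" intergene sizes mentioned in the statement above, here is a minimal sketch. It is not Zombi's code; the function name and parameters are hypothetical. It relies on the standard fact that normalizing independent Exp(1) draws yields a flat (all parameters equal to 1) Dirichlet sample, then scales the proportions to an assumed total intergenic length.

```python
import random


def flat_dirichlet_intergene_sizes(n_intergenes, total_length, rng=None):
    """Split total_length into n_intergenes sizes drawn from a flat Dirichlet.

    Hypothetical helper for illustration only (not part of Zombi).
    Normalized Exp(1) draws are distributed as Dirichlet(1, ..., 1).
    """
    rng = rng or random.Random()
    draws = [rng.expovariate(1.0) for _ in range(n_intergenes)]
    scale = total_length / sum(draws)
    return [d * scale for d in draws]


# Example: 5 intergenes sharing 10,000 bp of intergenic sequence (assumed values).
sizes = flat_dirichlet_intergene_sizes(5, 10_000, random.Random(0))
```

The sizes are real-valued here; a simulator working with base pairs would presumably round them or draw integer breakpoints instead.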


Zombi: A phylogenetic simulator of trees, genomes and sequences that accounts for dead lineages

Davín, Tricou, Tannier, et al. 2018

Preprint

Self Cite


Most living organisms that ever existed on Earth have left no descendants. Because introgressions and lateral gene transfers are frequent, some of these extinct lineages have impacted the evolution of extant species and their ancestors. As a consequence, ignoring extinct lineages in evolutionary studies can lead to spurious conclusions. Here we present Zombi, a platform to simulate the evolution of species, genes and genomes taking extinct lineages into account. We demonstrate its utility by testing a statistical inference method used to detect introgression and show that ignoring the presence of extinct lineages yields inconsistent results.


“…First, the definition of weighted genomes [1, 3] opens combinatorial questions, one of which being the transformation of a genome into another in a minimum number of steps. In a previous paper [3] we solved the strict version of this problem, where genomes were forced to have the same total intergene sizes and only wDCJs were allowed.…”

Section: Discussion (mentioning)

confidence: 99%
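
To make the size-preserving "strict" setting in the statement above concrete, here is a minimal sketch of one weighted DCJ acting as an inversion on a circular genome. The representation (signed gene list, `intergenes[k]` lying between `genes[k]` and the next gene) and the function name are assumptions made for illustration, not the cited papers' implementation. Each of the two cut intergenes is split at a chosen position and the pieces are rejoined, so the total intergene size is unchanged.

```python
def weighted_inversion(genes, intergenes, i, j, a, b):
    """Invert genes[i+1..j] by cutting intergene i at offset a and intergene j at offset b.

    Hypothetical sketch: genes are signed integers on a circular chromosome and
    intergenes[k] is the size between genes[k] and genes[(k + 1) % n].
    """
    n = len(genes)
    if not (0 <= i < j < n):
        raise ValueError("need 0 <= i < j < len(genes)")
    if not (0 <= a <= intergenes[i] and 0 <= b <= intergenes[j]):
        raise ValueError("cut offsets must lie inside the cut intergenes")

    # The segment between the two cuts is reversed and its genes change sign.
    inverted = [-g for g in reversed(genes[i + 1:j + 1])]
    new_genes = genes[:i + 1] + inverted + genes[j + 1:]

    # The left pieces of both cuts join together, as do the right pieces.
    left_join = a + b
    right_join = (intergenes[i] - a) + (intergenes[j] - b)
    inner = list(reversed(intergenes[i + 1:j]))
    new_intergenes = intergenes[:i] + [left_join] + inner + [right_join] + intergenes[j + 1:]
    return new_genes, new_intergenes


new_genes, new_intergenes = weighted_inversion(
    [1, 2, 3, 4], [10, 20, 30, 40], i=0, j=2, a=3, b=12
)
```

In this example the two cut intergenes (sizes 10 and 30) become two new intergenes of sizes 15 and 25, so the total of 100 is conserved, which is the constraint the strict version of the problem imposes.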

“…In a previous publication [1], we have argued that intergenic sizes were a crucial parameter to infer genome rearrangement distances. Indeed, ignoring this information, as all published distance estimations were doing so far [2], leads to strong biases in all estimations and validation procedures.…”

Section: Introduction (mentioning)

confidence: 99%

“…Indeed it is known that such a space is huge [4, 5], which makes it hard to analyze; several methods have thus been devised to add genomic or epigenomic constraints to restrict the search space [6–8]. So far, the potential of intergenic sizes has only been explored for distance computations [1, 3]. We show that it can also contain information on the scenarios, by characterizing categories of DCJs that can be used in optimal DCJs and indels scenarios.…”

Section: Introduction (mentioning)

confidence: 99%


Genome rearrangements with indels in intergenes restrict the scenario space

2016

Self Cite


Background: Given two genomes that have diverged by a series of rearrangements, we infer minimum Double Cut-and-Join (DCJ) scenarios to explain their organization differences, coupled with indel scenarios to explain their intergene size distribution, where DCJs themselves also alter the sizes of broken intergenes.

Results: We give a polynomial-time algorithm that, given two genomes with arbitrary intergene size distributions, outputs a DCJ scenario which optimizes on the number of DCJs, and given this optimal number of DCJs, optimizes on the total sum of the sizes of the indels.

Conclusions: We show that there is valuable information in the intergene sizes concerning the rearrangement scenario itself. On simulated data we show that statistical properties of the inferred scenarios are closer to the true ones than DCJ-only scenarios, i.e. scenarios which do not handle intergene sizes.


Comparative Methods for Reconstructing Ancient Genome Organization

Anselmetti, Luhmann, Bérard, et al. 2017

Comparative Genomics

Self Cite


Comparative genomics considers the detection of similarities and differences between extant genomes, and, based on more or less formalized hypotheses regarding the involved evolutionary processes, inferring ancestral states explaining the similarities and an evolutionary history explaining the differences. In this chapter, we focus on the reconstruction of the organization of ancient genomes into chromosomes. We review different methodological approaches and software, applied to a wide range of datasets from different kingdoms of life and at different evolutionary depths. We discuss relations with genome assembly, and potential approaches to validate computational predictions on ancient genomes that are almost always only accessible through these predictions.
