Summary: Next-generation DNA sequencing (NGS) can be used to reconstruct eco-evolutionary population dynamics and to identify the genetic basis of adaptation in laboratory evolution experiments. Here, we describe how to run the open-source breseq computational pipeline to identify and annotate genetic differences found in whole-genome and whole-population NGS data from haploid microbes where a high-quality reference genome is available. These methods can also be used to analyze mutants isolated in genetic screens and to detect unintended mutations that may occur during strain construction and genome editing.
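As a rough illustration of what running the pipeline on clonal versus whole-population data might look like, the sketch below assembles a breseq command line in Python. The helper `build_breseq_command` and all file names are hypothetical and not taken from the abstract; the flags used (`-r` for the reference, `-o` for the output directory, `-j` for the number of processors, `-p` for polymorphism/population mode) are standard breseq options, but check `breseq --help` for the installed version before relying on them.

```python
import shlex


def build_breseq_command(reference, reads, output_dir="breseq_output",
                         population=False, threads=1):
    """Assemble a breseq argument list (hypothetical helper, illustration only).

    reference  -- path to an annotated reference genome (e.g. GenBank format)
    reads      -- list of FASTQ files containing the sequencing reads
    population -- if True, add -p to predict polymorphisms in a mixed population
    """
    cmd = ["breseq", "-r", reference, "-o", output_dir, "-j", str(threads)]
    if population:
        cmd.append("-p")  # polymorphism (whole-population) mode
    cmd.extend(reads)
    return cmd


# Placeholder file names; substitute real paths.
clonal_cmd = build_breseq_command(
    "reference.gbk", ["clone_R1.fastq", "clone_R2.fastq"], threads=4)
population_cmd = build_breseq_command(
    "reference.gbk", ["population.fastq"], population=True)

# Print the commands so they can be inspected or pasted into a shell.
print(shlex.join(clonal_cmd))
print(shlex.join(population_cmd))
```

To execute a command rather than print it, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where breseq is installed.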
Adaptation by natural selection depends on the rates, effects, and interactions of many mutations, making it difficult to determine what proportion of mutations in an evolving lineage are beneficial. We analysed 264 complete genomes from 12 Escherichia coli populations to characterize their dynamics over 50,000 generations. The populations that retained the ancestral mutation rate support a model where most fixed mutations are beneficial, the fraction of beneficial mutations declines as fitness rises, and neutral mutations accumulate at a constant rate. We also compared these populations to mutation-accumulation lines evolved under a bottlenecking regime that minimizes selection. Nonsynonymous mutations, intergenic mutations, insertions, and deletions are overrepresented in the long-term populations, further supporting the inference that most mutations that reached high frequency were favoured by selection. These results illuminate the shifting balance of forces that govern genome evolution in populations adapting to a new environment.
Adaptation depends on the rates, effects, and interactions of many mutations. We analyzed 264 genomes from 12 Escherichia coli populations to characterize their dynamics over 50,000 generations. The trajectories for genome evolution in populations that retained the ancestral mutation rate fit a model where most fixed mutations are beneficial, the fraction of beneficial mutations declines as fitness rises, and neutral mutations accumulate at a constant rate. We also compared these populations to lines evolved under a mutation-accumulation regime that minimizes selection. Nonsynonymous mutations, intergenic mutations, insertions, and deletions are overrepresented in the long-term populations, supporting the inference that most fixed mutations are favored by selection. These results illuminate the shifting balance of forces that govern genome evolution in populations adapting to a new environment.
Background: Mutations that alter chromosomal structure play critical roles in evolution and disease, including in the origin of new lifestyles and pathogenic traits in microbes. Large-scale rearrangements in genomes are often mediated by recombination events involving new or existing copies of mobile genetic elements, recently duplicated genes, or other repetitive sequences. Most current software programs for predicting structural variation from short-read DNA resequencing data are intended primarily for use on human genomes. They typically disregard information in reads mapping to repeat sequences, and significant post-processing and manual examination of their output is often required to rule out false-positive predictions and precisely describe mutational events.
Results: We have implemented an algorithm for identifying structural variation from DNA resequencing data as part of the breseq computational pipeline for predicting mutations in haploid microbial genomes. Our method evaluates the support for new sequence junctions present in a clonal sample from split-read alignments to a reference genome, including matches to repeat sequences. Then, it uses a statistical model of read coverage evenness to accept or reject these predictions. Finally, breseq combines predictions of new junctions and deleted chromosomal regions to output biologically relevant descriptions of mutations and their effects on genes. We demonstrate the performance of breseq on simulated Escherichia coli genomes with deletions generating unique breakpoint sequences, new insertions of mobile genetic elements, and deletions mediated by mobile elements. Then, we reanalyze data from an E. coli K-12 mutation accumulation evolution experiment in which structural variation was not previously identified. Transposon insertions and large-scale chromosomal changes detected by breseq account for ~25% of spontaneous mutations in this strain. In all cases, we find that breseq is able to reliably predict structural variation with modest read-depth coverage of the reference genome (>40-fold).
Conclusions: Using breseq to predict structural variation should be useful for studies of microbial epidemiology, experimental evolution, synthetic biology, and genetics when a reference genome for a closely related strain is available. In these cases, breseq can discover mutations that may be responsible for important or unintended changes in genomes that might otherwise go undetected.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-1039) contains supplementary material, which is available to authorized users.
Genetic amplification, mutation, and translocation are known to play a causal role in the upregulation of an oncogene in cancer cells. Here, we report an emerging role of microRNA, the epigenetic deregulation of which may also lead to this oncogenic activation. SOX4, an oncogene belonging to the SRY-related high mobility group box family, was found to be overexpressed (P < 0.005) in endometrial tumors (n = 74) compared with uninvolved controls (n = 20). This gene is computationally predicted to be the target of a microRNA, miR-129-2. When compared with the matched endometria, the expression of miR-129-2 was lost in 27 of 31 primary endometrial tumors that also showed a concomitant gain of SOX4 expression (P < 0.001). This inverse relationship is associated with hypermethylation of the miR-129-2 CpG island, which was observed in endometrial cancer cell lines (n = 6) and 68% of 117 endometrioid endometrial tumors analyzed. Reactivation of miR-129-2 in cancer cells by pharmacologic induction of histone acetylation and DNA demethylation resulted in decreased SOX4 expression. In addition, restoration of miR-129-2 by cell transfection led to decreased SOX4 expression and reduced proliferation of cancer cells. Further analysis found a significant correlation of hypermethylated miR-129-2 with microsatellite instability and MLH1 methylation status (P < 0.001) and poor overall survival (P < 0.039) in patients. Therefore, these results imply that the aberrant expression of SOX4 is, in part, caused by epigenetic repression of miR-129-2 in endometrial cancer. Unlike the notion that promoter hypomethylation may upregulate an oncogene, we present a new paradigm in which hypermethylation-mediated silencing of a microRNA derepresses its oncogenic target in cancer cells. [Cancer Res 2009;69(23):9038-46]
Large-scale rearrangements may be important in evolution because they can alter chromosome organization and gene expression in ways not possible through point mutations. In a long-term evolution experiment, twelve Escherichia coli populations have been propagated in a glucose-limited environment for over 25 years. We used whole-genome mapping (optical mapping) combined with genome sequencing and PCR analysis to identify the large-scale chromosomal rearrangements in clones from each population after 40,000 generations. A total of 110 rearrangement events were detected, including 82 deletions, 19 inversions, and 9 duplications, with lineages having between 5 and 20 events. In three populations, successive rearrangements impacted particular regions. In five populations, rearrangements affected over a third of the chromosome. Most rearrangements involved recombination between insertion sequence (IS) elements, illustrating their importance in mediating genome plasticity. Two lines of evidence suggest that at least some of these rearrangements conferred higher fitness. First, parallel changes were observed across the independent populations, with ~65% of the rearrangements affecting the same loci in at least two populations. For example, the ribose-utilization operon and the manB-cpsG region were deleted in 12 and 10 populations, respectively, suggesting positive selection, and this inference was previously confirmed for the former case. Second, optical maps from clones sampled over time from one population showed that most rearrangements occurred early in the experiment, when fitness was increasing most rapidly. However, some rearrangements likely occur at high frequency and may have simply hitchhiked to fixation. In any case, large-scale rearrangements clearly influenced genomic evolution in these populations.
Isolated populations derived from a common ancestor are expected to diverge genetically and phenotypically as they adapt to different local environments. To examine this process, 30 populations of Escherichia coli were evolved for 2,000 generations, with six in each of five different thermal regimes: constant 20 °C, 32 °C, 37 °C, 42 °C, and daily alternations between 32 °C and 42 °C. Here, we sequenced the genomes of one endpoint clone from each population to test whether the history of adaptation in different thermal regimes was evident at the genomic level. The evolved strains had accumulated ∼5.3 mutations, on average, and exhibited distinct signatures of adaptation to the different environments. On average, two strains that evolved under the same regime exhibited ∼17% overlap in which genes were mutated, whereas pairs that evolved under different conditions shared only ∼4%. For example, all six strains evolved at 32 °C had mutations in nadR, whereas none of the other 24 strains did. However, a population evolved at 37 °C for an additional 18,000 generations eventually accumulated mutations in the signature genes strongly associated with adaptation to the other temperature regimes. Two mutations that arose in one temperature treatment tended to be beneficial when tested in the others, although less so than in the regime in which they evolved. These findings demonstrate that genomic signatures of adaptation can be highly specific, even with respect to subtle environmental differences, but that this imprint may become obscured over longer timescales as populations continue to change and adapt to the shared features of their environments.
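The pairwise overlap figures quoted above (∼17% within a regime versus ∼4% between regimes) can be illustrated with a small calculation. The abstract does not say which overlap measure was used, so this sketch assumes Dice's coefficient, 2|A ∩ B| / (|A| + |B|), as one plausible choice. The function names and the toy gene sets are hypothetical; only nadR appears in the abstract, and the placeholder genes are not real data.

```python
from itertools import combinations


def dice_overlap(genes_a, genes_b):
    """Dice's coefficient between two sets of mutated genes (assumed metric)."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        return 0.0  # define overlap of two empty sets as zero
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_overlap(strains):
    """Average Dice overlap over all unordered pairs of strains."""
    pairs = list(combinations(strains, 2))
    return sum(dice_overlap(x, y) for x, y in pairs) / len(pairs)


# Toy example: three strains with invented sets of mutated genes.
toy_strains = [
    {"nadR", "geneA"},
    {"nadR", "geneB"},
    {"geneC"},
]

print(f"mean pairwise overlap: {mean_pairwise_overlap(toy_strains):.3f}")
```

Averaging this quantity separately over same-regime pairs and different-regime pairs would give the two kinds of summary the abstract compares.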
Early exposure to xenoestrogens may predispose to breast cancer risk later in adult life. It is likely that long-lived, self-regenerating epithelial progenitor cells are more susceptible to these exposure injuries over time and transmit the injured memory through epigenetic mechanisms to their differentiated progeny. Here, we used progenitor-containing mammospheres as an in vitro exposure model to study this epigenetic effect. Expression profiling identified that, relative to control cells, 9.1% of microRNAs (82 of 898 loci) were altered in epithelial progeny derived from mammospheres exposed to a synthetic estrogen, diethylstilbestrol. Repressive chromatin marks, trimethyl Lys27 of histone H3 (H3K27me3) and dimethyl Lys9 of histone H3 (H3K9me2), were found at a down-regulated locus, miR-9-3, in epithelial cells preexposed to diethylstilbestrol. This was accompanied by recruitment of DNA methyltransferase 1 that caused an aberrant increase in DNA methylation of its promoter CpG island in mammosphere-derived epithelial cells on diethylstilbestrol preexposure. Functional analyses suggest that miR-9-3 plays a role in the p53-related apoptotic pathway. Epigenetic silencing of this gene, therefore, reduces this cellular function and promotes the proliferation of breast cancer cells. Promoter hypermethylation of this microRNA may be a hallmark for early breast cancer development, and restoration of its expression by epigenetic and microRNA-based therapies is another viable option for future treatment of this disease. [Cancer Res 2009;69(14):5936-45]