Nicole T. Perna scite author profile

The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.

Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

Darling, Mau, Blattner et al. 2004

Genome Res.

3,838

3,082

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under consideration. Furthermore, the linear order of these segments may be shuffled among genomes. We present methods for identification and alignment of conserved genomic DNA in the presence of rearrangements and horizontal transfer. Our methods have been implemented in a software package called Mauve. Mauve has been applied to align nine enterobacterial genomes and to determine global rearrangement structure in three mammalian genomes. We have evaluated the quality of Mauve alignments and drawn comparison to other methods through extensive simulations of genome evolution.

progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss and Rearrangement

2010

Background: Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms.

Methodology/Principal Findings: We describe a new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss (flux). We demonstrate that the new method can accurately align regions conserved in some, but not all, of the genomes, an important case not handled by our previous work. The method uses a novel alignment objective score called a sum-of-pairs breakpoint score, which facilitates accurate detection of rearrangement breakpoints when genomes have unequal gene content. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The new genome alignment algorithm demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss. We apply the new algorithm to a set of 23 genomes from the genera Escherichia, Shigella, and Salmonella. Analysis of whole-genome multiple alignments allows us to extend the previously defined concepts of core- and pan-genomes to include not only annotated genes, but also non-coding regions with potential regulatory roles. The 23 enterobacteria have an estimated core-genome of 2.46 Mbp conserved among all taxa and a pan-genome of 15.2 Mbp. We document substantial population-level variability among these organisms driven by segmental gain and loss. Interestingly, much variability lies in intergenic regions, suggesting that the Enterobacteriaceae may exhibit regulatory divergence.

Conclusions: The multiple genome alignments generated by our software provide a platform for comparative genomic and population genomic studies. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve.
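The core- and pan-genome sizes quoted in the abstract above can be illustrated with a small set computation. The sketch below is only a toy illustration, not the progressiveMauve procedure: it assumes an alignment has already been reduced to a presence/absence table of segments per genome, and the names `core_and_pan`, `presence`, and `lengths` as well as all the data are hypothetical.

```python
from typing import Dict, Set, Tuple


def core_and_pan(presence: Dict[str, Set[str]],
                 lengths: Dict[str, int]) -> Tuple[int, int]:
    """Return (core_bp, pan_bp) for a presence/absence table.

    presence maps a genome name to the set of segment IDs it contains;
    lengths maps each segment ID to its length in base pairs.
    Both structures are hypothetical stand-ins for alignment output.
    """
    genomes = list(presence.values())
    if not genomes:
        raise ValueError("at least one genome is required")
    core = set.intersection(*genomes)  # segments present in every genome
    pan = set.union(*genomes)          # segments present in any genome
    core_bp = sum(lengths[s] for s in core)
    pan_bp = sum(lengths[s] for s in pan)
    return core_bp, pan_bp


# Toy data: three genomes sharing only segment s1.
presence = {
    "genome_A": {"s1", "s2", "s3"},
    "genome_B": {"s1", "s2", "s4"},
    "genome_C": {"s1", "s3", "s5"},
}
lengths = {"s1": 1000, "s2": 500, "s3": 300, "s4": 200, "s5": 100}

core_bp, pan_bp = core_and_pan(presence, lengths)
print(core_bp, pan_bp)
```

Because segments rather than annotated genes are counted, the same computation covers non-coding regions, which is the extension of the core/pan concept that the abstract describes.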

Genome sequence of enterohaemorrhagic Escherichia coli O157:H7

Perna, Plunkett, Burland et al. 2001

Nature

1,848

1,599

The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions--all of which could be targets for surveillance.

Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli

Welch, Burland, Plunkett et al. 2002

Proc. Natl. Acad. Sci. U.S.A.

1,342

1,174

We present the complete genome sequence of uropathogenic Escherichia coli, strain CFT073. A three-way genome comparison of the CFT073, enterohemorrhagic E. coli EDL933, and laboratory strain MG1655 reveals that, amazingly, only 39.2% of their combined (nonredundant) set of proteins actually are common to all three strains. The pathogen genomes are as different from each other as each pathogen is from the benign strain. The difference in disease potential between O157:H7 and CFT073 is reflected in the absence of genes for type III secretion system or phage- and plasmid-encoded toxins found in some classes of diarrheagenic E. coli. The CFT073 genome is particularly rich in genes that encode potential fimbrial adhesins, autotransporters, iron-sequestration systems, and phase-switch recombinases. Striking differences exist between the large pathogenicity islands of CFT073 and two other well-studied uropathogenic E. coli strains, J96 and 536. Comparisons indicate that extraintestinal pathogenic E. coli arose independently from multiple clonal lineages. The different E. coli pathotypes have maintained a remarkable synteny of common, vertically evolved genes, whereas many islands interrupting this common backbone have been acquired by different horizontal transfer events in each strain.

Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes

1995

Three statistics (%GC, GC-skew, and AT-skew) can be used to describe the overall patterns of nucleotide composition in DNA sequences. Fourfold degenerate third codon positions from 16 animal mitochondrial genomes were analyzed. The overall composition, as measured by %GC, varies from 3.6 %GC in the honeybee to 47.2 %GC in human mtDNA. Compositional differences between strands of the mitochondrial genome were quantified using the two skew statistics presented in this paper. Strand-specific distribution of bases varies among animal taxa independently of overall %GC. Compositional patterns reflect the substitution process. Description of these patterns may aid in the formation of hypotheses about substitutional mechanisms.
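The three statistics named in the abstract above can be computed directly from base counts. The abstract itself does not spell out the formulas, so the sketch below assumes the commonly used definitions GC-skew = (G - C)/(G + C) and AT-skew = (A - T)/(A + T); the function name `composition_stats` and the example sequence are hypothetical.

```python
from collections import Counter
from typing import Tuple


def composition_stats(seq: str) -> Tuple[float, float, float]:
    """Return (%GC, GC-skew, AT-skew) for one strand of a DNA sequence.

    Assumed definitions:
      %GC     = 100 * (G + C) / (A + C + G + T)
      GC-skew = (G - C) / (G + C)
      AT-skew = (A - T) / (A + T)
    Characters other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc_percent = 100.0 * (g + c) / total
    # A skew is undefined when its denominator is zero; report 0.0 then.
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return gc_percent, gc_skew, at_skew


# Toy input: in practice this would be the concatenated fourfold
# degenerate third codon positions of one strand.
print(composition_stats("AAATGGGC"))
```

Both skews range from -1 to 1 and do not depend on overall %GC, which is why strand asymmetry can vary independently of base composition, as the abstract notes.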

Reordering contigs of draft genomes using the Mauve Aligner

et al. 2009

Summary: Mauve Contig Mover provides a new method for proposing the relative order of contigs that make up a draft genome based on comparison to a complete or draft reference genome. A novel application of the Mauve aligner and viewer provides an automated reordering algorithm coupled with a powerful drill-down display allowing detailed exploration of results.

Availability: The software is available for download at http://gel.ahabs.wisc.edu/mauve.

Contact: rissman@wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online and http://gel.ahabs.wisc.edu

Genome Sequence of Yersinia pestis KIM

et al. 2002

We present the complete genome sequence of Yersinia pestis KIM, the etiologic agent of bubonic and pneumonic plague. The strain KIM, biovar Mediaevalis, is associated with the second pandemic, including the Black Death. The 4.6-Mb genome encodes 4,198 open reading frames (ORFs). The origin, terminus, and most genes encoding DNA replication proteins are similar to those of Escherichia coli K-12. The KIM genome sequence was compared with that of Y. pestis CO92, biovar Orientalis, revealing homologous sequences but a remarkable amount of genome rearrangement for strains so closely related. The differences appear to result from multiple inversions of genome segments at insertion sequences, in a manner consistent with present knowledge of replication and recombination. There are few differences attributable to horizontal transfer. The KIM and E. coli K-12 genome proteins were also compared, exposing surprising amounts of locally colinear "backbone," or synteny, that is not discernible at the nucleotide level. Nearly 54% of KIM ORFs are significantly similar to K-12 proteins, with conserved housekeeping functions. However, a number of E. coli pathways and transport systems and at least one global regulator were not found, reflecting differences in lifestyle between them. In KIM-specific islands, new genes encode candidate pathogenicity proteins, including iron transport systems, putative adhesins, toxins, and fimbriae.
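One simple way to quantify rearrangement of the kind the abstract above attributes to inversions is to count breakpoints between two signed gene orders. The sketch below is a generic textbook-style illustration under stated assumptions, not the analysis used in the paper and not Mauve's scoring: genomes are modelled as linear lists of signed integers (the sign giving orientation), both genomes contain the same genes, and the name `count_breakpoints` and the toy gene orders are hypothetical.

```python
from typing import List


def count_breakpoints(reference: List[int], other: List[int]) -> int:
    """Count adjacencies of `reference` that are not preserved in `other`.

    Genes are signed integers. An adjacency (x, y) counts as preserved
    if `other` contains x followed by y, or -y followed by -x (the same
    pair read from the opposite strand).
    """
    preserved = set()
    for x, y in zip(other, other[1:]):
        preserved.add((x, y))
        preserved.add((-y, -x))
    return sum(1 for pair in zip(reference, reference[1:])
               if pair not in preserved)


reference = [1, 2, 3, 4, 5]
inverted = [1, -3, -2, 4, 5]  # genes 2..3 inverted as one segment

print(count_breakpoints(reference, reference))
print(count_breakpoints(reference, inverted))
```

A single inversion disrupts the adjacency at each of its two ends while leaving adjacencies inside the inverted segment intact, so a modest number of inversions at insertion sequences can scramble gene order between closely related strains without any gain or loss of genes.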

The Complete Genome Sequence of Escherichia coli K-12
