The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.
The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions--all of which could be targets for surveillance.
The low-Ca2+-response (LCR) plasmid pCD1 of the plague agent Yersinia pestis KIM5 was sequenced and analyzed for its genetic structure. pCD1 (70,509 bp) has an IncFIIA-like replicon and a SopABC-like partition region. We have assigned 60 apparently intact open reading frames (ORFs) that are not contained within transposable elements. Of these, 47 are proven or possible members of the LCR, a major virulence property of human-pathogenic Yersinia spp., that had been identified previously in one or more of Y. pestis or the enteropathogenic yersiniae Yersinia enterocolitica and Yersinia pseudotuberculosis. Of these 47 LCR-related ORFs, 35 constitute a continuous LCR cluster. The other LCR-related ORFs are interspersed among three intact insertion sequence (IS) elements (IS100 and two new IS elements, IS1616 and IS1617) and numerous defective or partial transposable elements. Regional variations in percent GC content and among ORFs encoding effector proteins of the LCR are additional evidence of a complex history for this plasmid. Our analysis suggested the possible addition of a new Syc- and Yop-encoding operon to the LCR-related pCD1 genes and gave no support for the existence of YopL. YadA likely is not expressed, as was the case for Y. pestis EV76, and the gene for the lipoprotein YlpA found in Y. enterocolitica likely is a pseudogene in Y. pestis. The yopM gene is longer than previously thought (by a sequence encoding two leucine-rich repeats), the ORF upstream of ypkA-yopJ is discussed as a potential Syc gene, and a previously undescribed ORF downstream of yopE was identified as being potentially significant. Eight other ORFs not associated with IS elements were identified and deserve future investigation into their functions.