The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has proven flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit and the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort. [Supplemental material is available online at www.genome.org. Bioperl is available as open-source software free of charge and is licensed under the Perl Artistic License (http://www.perl.com/pub/a/language/misc/Artistic.html). It is available for download at http://www.bioperl.org. Support inquiries should be addressed to bioperl-l@bioperl.org.]
In May of 2011, an enteroaggregative Escherichia coli O104:H4 strain that had acquired a Shiga toxin 2-converting phage caused a large outbreak of bloody diarrhea in Europe which was notable for its high prevalence of hemolytic uremic syndrome cases. Several studies have described the genomic inventory and phylogenies of strains associated with the outbreak and a collection of historical E. coli O104:H4 isolates using draft genome assemblies. We present the complete, closed genome sequences of an isolate from the 2011 outbreak (2011C–3493) and two isolates from cases of bloody diarrhea that occurred in the Republic of Georgia in 2009 (2009EL–2050 and 2009EL–2071). Comparative genome analysis indicates that, while the Georgian strains are the nearest neighbors to the 2011 outbreak isolates sequenced to date, structural and nucleotide-level differences are evident in the Stx2 phage genomes, the mer/tet antibiotic resistance island, and in the prophage and plasmid profiles of the strains, including a previously undescribed plasmid with homology to the pMT virulence plasmid of Yersinia pestis. In addition, multiphenotype analysis showed that 2009EL–2071 possessed higher resistance to polymyxin and membrane-disrupting agents. Finally, we show evidence by electron microscopy of the presence of a common phage morphotype among the European and Georgian strains and a second phage morphotype among the Georgian strains. The presence of at least two stx2 phage genotypes in host genetic backgrounds that may derive from a recent common ancestor of the 2011 outbreak isolates indicates that the emergence of stx2 phage-containing E. coli O104:H4 strains probably occurred more than once, or that the current outbreak isolates may be the result of a recent transfer of a new stx2 phage element into a pre-existing stx2-positive genetic background.
The genome of the flowering plant Arabidopsis thaliana has five chromosomes. Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.
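The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and because the contig sizes are given as approximate and exclude the centromere, the derived values are rough estimates rather than reported results.

```python
# Approximate figures taken from the abstract above.
contig_sizes_mb = (14.2, 14.6)   # two contigs of chromosome 1, in megabases
genome_fraction = 0.25           # chromosome 1 as a share of the genome
orf_count = 6850                 # "about 6,850 open reading frames"
repeat_fraction = 0.08           # repeat elements as a share of the sequence

# Sequenced length of chromosome 1 (centromere excluded).
chromosome_mb = sum(contig_sizes_mb)

# Genome size implied by the 25% figure.
implied_genome_mb = chromosome_mb / genome_fraction

# Average spacing between open reading frames, in kilobases.
kb_per_orf = chromosome_mb * 1000 / orf_count

# Sequence occupied by repeat elements, in megabases.
repeat_mb = chromosome_mb * repeat_fraction

print(f"chromosome 1: ~{chromosome_mb:.1f} Mb")
print(f"implied genome: ~{implied_genome_mb:.0f} Mb")
print(f"one ORF per ~{kb_per_orf:.1f} kb")
print(f"repeats: ~{repeat_mb:.1f} Mb")
```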
We have isolated and mutagenized a DNA fragment from Saccharomyces cerevisiae that specifies mRNA 3' end formation for the convergently transcribed CYC1 and UTR1 genes. An in vivo plasmid supercoiling assay previously showed that this fragment is a transcriptional terminator, and "run-on" assays shown here are consistent with this interpretation. The poly(A) sites in the mRNAs formed by the fragment are the same whether the fragment resides at the native location or at a heterologous location. No single linker substitution abolishes the fragment's activity, whereas certain large, nonoverlapping deletions have strong, deleterious effects. Therefore, the yeast terminator behaves more like rho-dependent bacterial terminators than terminators of higher eukaryotes. That a number of deletions or substitutions have different effects in the two orientations suggests that the fragment contains the sequences of two unidirectional terminator elements.
The primary structure of the eukaryotic mRNA is due to precise sites of transcriptional initiation and RNA processing events, including splicing and end formation (reviewed in refs. 1-4). The clearest models of mRNA 3' end formation emerge from genetic and biochemical experimentation in metazoan cells. Three general events have been described. (i) Transcription by RNA polymerase II appears to terminate hundreds of base pairs downstream from the eventual junction of the template-derived RNA and the poly(A) tract (5-8). DNA fragments from those downstream regions that act as terminators have been isolated, but their fine structure and their mode of action have not yet been elucidated (9). (ii) The sequence AAUAAA in the RNA directs cleavage 10-30 bases 3' to the hexanucleotide element (10-15). Extensive mutagenesis of this sequence and its immediate environment indicates that AAUAAA is essential for proper processing and that sequences immediately 3' to the cleavage site are also required (16-19).
(iii) The cleaved RNA serves as a substrate for processive polyadenylylation (20, 21). Macromolecular complexes that carry out the latter two reactions have been isolated, and small nuclear ribonucleoproteins are thought to participate (12, 21, 22). Remarkably, evidence indicates that all of these processes are coupled in vivo, as mutations within the AATAAA strongly decrease the efficiency of distant transcriptional termination (23-25).
How similar is 3' end formation in yeast to what occurs in higher cells? The consensus element TAG. .TAGT. .TTT has been proposed as a key element in termination for many genes, including CYC1 (iso-1-cytochrome c) of Saccharomyces cerevisiae (26). To begin an analysis of termination in yeast, we chose to study an 83-base-pair (bp) fragment past the 3' end of CYC1 that includes the consensus element. CYC1 and an adjacent gene, UTR1, are convergently transcribed and 3' ends for both mRNAs fall within the 83-bp region (Fig. 1). We showed previously that transcription of CYC1-lacZ on a plasmid in S. cerevisiae resulted in...
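The metazoan rule described in event (ii), in which AAUAAA directs cleavage 10-30 bases 3' to the hexanucleotide, can be illustrated with a short scan. This is a minimal sketch, not a method from the study: the function name `predicted_cleavage_windows` is hypothetical, the offsets are assumed to be counted from the first base after the hexamer, and the code ignores the additional downstream sequences that the text notes are also required.

```python
import re


def predicted_cleavage_windows(rna, min_offset=10, max_offset=30):
    """Return (signal_start, window_start, window_end) for each AAUAAA.

    Indices are 0-based; the window is half-open, [window_start, window_end),
    and covers the bases lying min_offset..max_offset positions 3' of the
    hexamer (assumption: offset 1 is the first base after AAUAAA).
    """
    # Accept DNA-style input by treating T as U.
    rna = rna.upper().replace("T", "U")
    windows = []
    # A lookahead match lets overlapping signals be reported as well.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        signal_end = match.start() + 6
        window_start = signal_end + min_offset - 1
        window_end = min(signal_end + max_offset, len(rna))
        # Skip signals that sit too close to the 3' end to have a window.
        if window_start < len(rna):
            windows.append((match.start(), window_start, window_end))
    return windows


example = "GG" + "AAUAAA" + "C" * 40
for signal, start, end in predicted_cleavage_windows(example):
    print(f"AAUAAA at {signal}: candidate cleavage region {start}..{end - 1}")
```

A scan like this only flags candidate regions in metazoan-style sequences; the passage goes on to ask how far the same picture carries over to yeast.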