The nematode worm Caenorhabditis elegans and its relatives are unique among animals in having operons. Operons are regulated multigene transcription units, in which polycistronic pre-messenger RNA (pre-mRNA coding for multiple peptides) is processed to monocistronic mRNAs. This occurs by 3' end formation and trans-splicing using the specialized SL2 small nuclear ribonucleoprotein particle for downstream mRNAs. Previously, the correlation between downstream location in an operon and SL2 trans-splicing has been strong, but anecdotal. Although only 28 operons have been reported, the complete sequence of the C. elegans genome reveals numerous gene clusters. To determine how many of these clusters represent operons, we probed full-genome microarrays for SL2-containing mRNAs. We found significant enrichment for about 1,200 genes, including most of a group of several hundred genes represented by complementary DNAs that contain SL2 sequence. Analysis of their genomic arrangements indicates that >90% are downstream genes, falling in 790 distinct operons. Our evidence indicates that the genome contains at least 1,000 operons, 2–8 genes long, that contain about 15% of all C. elegans genes. Numerous examples of co-transcription of genes encoding functionally related proteins are evident. Inspection of the operon list should reveal previously unknown functional relationships.
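The genomic-arrangement analysis described in this abstract can be pictured with a small sketch. It is an illustration only, not the authors' pipeline: the `Gene` record, the `sl2` flag, the `max_gap` threshold of 500 bp, the single-chromosome simplification, and the toy coordinates are all assumptions introduced here. The sketch groups same-strand neighbours separated by short intergenic gaps into candidate clusters, then asks what fraction of SL2-flagged genes sit in a downstream (non-first) position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    strand: str  # "+" or "-"
    start: int   # leftmost coordinate on one chromosome (assumed)
    end: int     # rightmost coordinate
    sl2: bool    # hypothetical flag: SL2-enriched on the microarray


def candidate_clusters(genes, max_gap=500):
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    Returns clusters of two or more genes, each ordered in the direction
    of transcription. max_gap is an illustrative threshold, not a value
    taken from the paper.
    """
    clusters, current = [], []
    for gene in sorted(genes, key=lambda g: g.start):
        same_run = (
            current
            and gene.strand == current[-1].strand
            and gene.start - current[-1].end <= max_gap
        )
        if same_run:
            current.append(gene)
        else:
            if len(current) >= 2:
                clusters.append(current)
            current = [gene]
    if len(current) >= 2:
        clusters.append(current)
    # Minus-strand genes are transcribed right to left, so reverse them.
    return [c if c[0].strand == "+" else c[::-1] for c in clusters]


def downstream_sl2_fraction(genes, max_gap=500):
    """Fraction of SL2-flagged genes found in a non-first cluster position."""
    downstream = {
        g.name for c in candidate_clusters(genes, max_gap) for g in c[1:]
    }
    flagged = [g for g in genes if g.sl2]
    if not flagged:
        return 0.0
    return sum(g.name in downstream for g in flagged) / len(flagged)


# Toy data: three tightly spaced plus-strand genes and one isolated gene.
toy = [
    Gene("a", "+", 100, 1000, False),
    Gene("b", "+", 1100, 2000, True),
    Gene("c", "+", 2150, 3000, True),
    Gene("d", "-", 9000, 9800, True),
]
clusters = candidate_clusters(toy)
fraction = downstream_sl2_fraction(toy)
```

In this toy set, genes `b` and `c` are SL2-flagged and downstream of `a`, while `d` is SL2-flagged but isolated, so two of the three flagged genes count as downstream. A real analysis would also need per-chromosome handling and an empirically chosen gap threshold.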
Background: Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results: This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions: As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems.
The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
Trisomy 21 (T21) causes Down syndrome (DS), but the mechanisms by which T21 produces the different disease spectrum observed in people with DS are unknown. We recently identified an activated interferon response associated with T21 in human cells of different origins, consistent with overexpression of the four interferon receptors encoded on chromosome 21, and proposed that DS could be understood partially as an interferonopathy. However, the impact of T21 on systemic signaling cascades in living individuals with DS is undefined. To address this knowledge gap, we employed proteomics approaches to analyze blood samples from 263 individuals, 165 of them with DS, leading to the identification of dozens of proteins that are consistently deregulated by T21. Most prominent among these proteins are numerous factors involved in immune control, the complement cascade, and growth factor signaling. Importantly, people with DS display higher levels of many pro-inflammatory cytokines (e.g. IL-6, MCP-1, IL-22, TNF-α) and pronounced complement consumption, resembling changes seen in type I interferonopathies and other autoinflammatory conditions. Therefore, these results are consistent with the hypothesis that increased interferon signaling caused by T21 leads to chronic immune dysregulation, and justify investigations to define the therapeutic value of immune-modulatory strategies in DS.
The genomes of most eukaryotes are composed of genes arranged on the chromosomes without regard to function, with each gene transcribed from a promoter at its 5′ end. However, the genome of the free-living nematode Caenorhabditis elegans contains numerous polycistronic clusters similar to bacterial operons in which the genes are transcribed sequentially from a single promoter at the 5′ end of the cluster. The resulting polycistronic pre-mRNAs are processed into monocistronic mRNAs by conventional 3′ end formation, cleavage, and polyadenylation, accompanied by trans-splicing with a specialized spliced leader (SL), SL2. To determine whether this mode of gene organization and expression, apparently unique among the animals, occurs in other species, we have investigated genes in a distantly related free-living rhabditid nematode in the genus Dolichorhabditis (strain CEW1). We have identified both SL1 and SL2 RNAs in this species. In addition, we have sequenced a Dolichorhabditis genomic region containing a gene cluster with all of the characteristics of the C. elegans operons. We show that the downstream gene is trans-spliced to SL2. We also present evidence that suggests that these two genes are also clustered in the C. elegans and Caenorhabditis briggsae genomes. Thus, it appears that the arrangement of genes in operons pre-dates the divergence of the genus Caenorhabditis from the other genera in the family Rhabditidae, and may be more widespread than is currently appreciated.
In bacteria and archaea, the genomes are primarily organized in arrays of genes whose products have related functions. These gene clusters, called operons, are cotranscribed from an upstream promoter and the resulting polycistronic mRNA is translated by ribosomes initiating at or near the 5′ end of the RNA. These operons serve to efficiently coregulate proteins that function together.
In contrast, eukaryotes have genomes composed of genes arranged apparently at random, with each transcribed by a promoter at its 5′ end. However, in a group of primitive eukaryotic protozoa, the trypanosomes, genes are transcribed polycistronically (1-3). In this case, the polycistronic pre-mRNA is processed by 3′ end formation and trans-splicing to create conventional eukaryotic monocistronic mRNAs. The trans-splicing reaction that creates the 5′ ends of the mRNAs is related to the cis-splicing of higher eukaryotes; it proceeds through a 2′-5′ branched intermediate, the splice sites have the same consensus sequences, and it is catalyzed by some of the same small nuclear ribonucleoprotein particles (4, 5).
Trans-splicing was first discovered in trypanosomatids (4, 6), and later shown to occur also in Caenorhabditis elegans and other nematodes (ref. 7; reviewed in refs. 8 and 9), in Euglena (10), and in flatworms (11, 12). In contrast to trypanosomes, in which only trans-splicing is present, the genes in the other organisms also contain cis-spliced introns. It was presumed that these genes were monocistronic and arranged randomly on the chromosomes as in other...
Streptococcus iniae was recovered from diseased rainbow trout (Oncorhynchus mykiss, Walbaum) previously vaccinated against streptococcosis. PCR and serological methods indicate the presence of a new serotype in the diseased fish.
The fish pathogen Streptococcus iniae (11) is endemic in various parts of the world, including Israel (6) and North America (10). S. iniae infection of rainbow trout (Oncorhynchus mykiss) produces a disease which substantially affects the brain, with only minor pathological changes in other organs (4). Recently, S. iniae has been isolated from diseased humans suffering from cellulitis, meningitis, and bacteremias, indicating a threat to public health (16).
A specific S. iniae vaccine became available in 1995. From 1995 to 1997, all Israeli trout farms in the Upper Galilee (which share water reservoirs) routinely vaccinated their entire stocks (roughly 3 million fish/year), reducing S. iniae-related mortalities from 50% annually to less than 5% (5). However, massive new outbreaks of the disease were recorded in 1997. Unlike the previous pathological manifestations, diseased fish exhibited multisystem organ involvement and diffuse internal hemorrhages. Brain samples were collected from diseased fish and streaked on Columbia agar base (Difco) supplemented with 5% (vol/vol) defibrinated sheep blood. Beta-hemolytic gram-positive cocci were detected following an incubation of 24 to 48 h at 24°C. Conventional identification schemes (API 20 STREP; BioMerieux SA, Marcy l'Etoile, France) suggested that all isolates (of 100 collected over 24 months) were S. iniae. The new isolates, unlike the previous isolates, were shown to be arginine dehydrolase (ADH) negative. Definitive identification was accomplished by PCR, using S. iniae 16S rDNA-specific primers (Zlotkin et al. [17]), which revealed the 300-bp S. iniae-specific PCR product in all isolates. Six (11)). The rDNA sequence analyses confirmed that the six isolates were S. iniae.
All had an identical sequence (GenBank accession no. AF335573) and differed from that of S. iniae ATCC 29178 (GenBank accession no. AF335572) in six bases (99.6% homology). EcoRI and HindIII digests of DNAs extracted from early and recent isolates resulted in identical restriction fragment length polymorphism ribotype patterns (data not shown), indicating that all isolates cluster in the Israeli S. iniae rank (6). The use of additional endonucleases (PvuII or KpnI) did not provide strain-to-strain differentiation (data not shown).
The random amplified polymorphic DNA (RAPD) technique (Fig. 1) was used to distinguish early from recent isolates. Primer p14 (5′-GATCAAGTCC), previously proven useful for discrimination among group A streptococcal strains, was used (Neeman et al. [8]). All early (1991 to 1995) isolates produced a band with an estimated length of 750 bp. No such band was found in the PCR product of the recent isolates (Fig. 1).
Hyperimmune sera for serological differentiation were obtained by three monthly immunizations of rainbow trout with formalin-fixed S. inia...
Ribonuclease P (RNase P) is the ribonucleoprotein endonuclease that processes the 5' ends of precursor tRNAs. Bacterial and eukaryal RNase P RNAs had the same primordial ancestor; however, they were molded differently by evolution. RNase P RNAs of eukaryotes, in contrast to bacterial RNAs, are not catalytically active in vitro without proteins. By comparing the bacterial and eukaryal RNAs, we can begin to understand the transitions made between the RNA and protein-dominated worlds. We report, based on crosslinking studies, that eukaryal RNAs, although catalytically inactive alone, fold into functional forms and specifically bind tRNA even in the absence of proteins. Based on the crosslinking results and crystal structures of bacterial RNAs, we develop a tertiary structure model of the eukaryal RNase P RNA. The eukaryal RNA contains a core structure similar to the bacterial RNA but lacks specific features that in bacterial RNAs contribute to catalysis and global stability of tertiary structure.
Polycistronic pre-mRNAs from Caenorhabditis elegans are processed by 3′ end formation of the upstream mRNA and SL2-specific trans-splicing of the downstream mRNA. These processes usually occur within an ∼100-nucleotide region and are mechanistically coupled. In this paper, we report a complex in C. elegans extracts containing the 3′ end formation protein CstF-64 and the SL2 snRNP. This complex, immunoprecipitated with αCstF-64 antibody, contains SL2 RNA, but not SL1 RNA or other U snRNAs. Using mutational analysis we have been able to uncouple SL2 snRNP function and identity. SL2 RNA with a mutation in stem/loop III is functional in vivo as a trans-splice donor, but fails to splice to SL2-accepting trans-splice sites, suggesting that it has lost its identity as an SL2 snRNP. Importantly, stem/loop III mutations prevent association of SL2 RNA with CstF-64. In contrast, a mutation in stem II that inactivates the SL2 snRNP still permits complex formation with CstF-64. Therefore, SL2 RNA stem/loop III is required for both SL2 identity and formation of a complex containing CstF-64, but not for trans-splicing. These results provide a molecular framework for the coupling of 3′ end formation and trans-splicing in the processing of polycistronic pre-mRNAs from C. elegans operons.
Polycistronic pre-mRNAs from these operons are processed into monocistronic mRNAs by cleavage and polyadenylation at the 3′ ends of upstream gene mRNAs accompanied by trans-splicing at the 5′ ends of downstream gene mRNAs. In general, these two processes occur within a 100-nucleotide region (Blumenthal and Steward 1997) and are mechanistically coupled (Kuersten et al. 1997).
In C. elegans, 3′ end formation is dependent on an AAUAAA signal (Kuersten et al. 1997; Liu et al. 2001). It is expected that this sequence is bound by cleavage and polyadenylation specificity factor (CPSF), as it is in mammalian cells (for reviews, see Colgan and Manley 1997; Keller and Minvielle-Sebastia 1997; Zhao et al. 1999).
Presumably, C. elegans 3′ end formation also requires cleavage stimulation factor (CstF), which binds a U-rich or GU-rich sequence downstream of the cleavage site. Homologs of each of the subunits of both mammalian CstF and CPSF are present in the C. elegans genome (C.J. Wilusz and T. Blumenthal, unpubl.).
Trans-splicing generates 5′ ends of mRNAs in trypanosomes and many animals (Murphy et al. 1986; Sutton and Boothroyd 1986; Krause and Hirsh 1987; Rajkovic et al. 1990; Tessier et al. 1991; Stover and Steele 2001; Vandenberghe et al. 2001). A spliced leader (SL) exon is donated to the 5′ ends of mRNAs by a short RNA donor called SL RNA. The SL RNA exists as a ribonucleoprotein (RNP) particle (Thomas et al. 1988; Van Doren and Hirsh 1988; Maroney et al. 1990; Goncharov et al. 1999) that includes the Sm core proteins (Lerner and Steitz 1979). Unlike the other U snRNPs, which are capable of catalyzing repeated splicing reactions, the SL snRNP is consumed during the trans-splicing reaction.
C. elegans possesses two distinct SL RNAs, SL1 RNA (Krause and Hirsh 1987) a...