RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.
To initiate studies on how protein-protein interaction (or “interactome”) networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains ∼5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
Despite the successes of genomics, little is known about how genetic information produces complex organisms. A look at the crucial functional elements of fly and worm genomes could change that.
A key challenge of functional genomics today is to generate well-annotated data sets that can be interpreted across different platforms and technologies. Large-scale functional genomics data often fail to connect to standard experimental approaches of gene characterization in individual laboratories. Furthermore, a lack of universal annotation standards for phenotypic data sets makes it difficult to compare different screening approaches. Here we address this problem in a screen designed to identify all genes required for the first two rounds of cell division in the Caenorhabditis elegans embryo. We used RNA-mediated interference to target 98% of all genes predicted in the C. elegans genome in combination with differential interference contrast time-lapse microscopy. Through systematic annotation of the resulting movies, we developed a phenotypic profiling system, which shows high correlation with cellular processes and biochemical pathways, thus enabling us to predict new functions for previously uncharacterized genes.
Despite the prominence of Caenorhabditis elegans as a major developmental and genetic model system, its phylogenetic relationship to its closest relatives has not been resolved. Resolution of these relationships is necessary for studying the steps that underlie life history, genomic, and morphological evolution of this important system. By using data from five different nuclear genes from 10 Caenorhabditis species currently in culture, we find a well resolved phylogeny that reveals three striking patterns in the evolution of this animal group: (i) Hermaphroditism has evolved independently in C. elegans and its close relative Caenorhabditis briggsae; (ii) there is a large degree of intron turnover within Caenorhabditis, and intron losses are much more frequent than intron gains; and (iii) despite the lack of marked morphological diversity, more genetic disparity is present within this one genus than has occurred within all vertebrates. Caenorhabditis elegans is an important model system that allows great depth of study into how the genome is translated into a developing, functioning animal (1). To generalize from this model, a phylogenetic context and information about related species are essential. The genome of a close relative, Caenorhabditis briggsae, was recently sequenced, providing an important comparative genomics tool for annotating the C. elegans genome (2). However, genome comparisons for multiple species that are closely related can provide substantially more analytical power, as demonstrated recently by genome comparisons among several closely related yeast species (3).
A well resolved phylogeny for closely related species provides the basis for selecting appropriate representatives for such comparisons, for distinguishing orthologous from paralogous genes, and for distinguishing ancestral versus derived states for characters (4). Comparisons among phylogenetically closely related species, as opposed to comparisons among distantly related groups, are more likely to reveal finer detail about the steps that underlie life history, genomic, and morphological evolution. For example, in the absence of a phylogeny that includes additional closely related species, a feature that is actually convergent between two species may appear to be homologous, as we show below for the case of hermaphroditic reproduction. Also, the time and the frequency at which evolutionary events occurred, like the loss and gain of introns, may be obscured by comparing distantly related species or anciently duplicated genes. We therefore aimed to resolve the phylogenetic relationships of all Caenorhabditis species currently in culture by using gene sequence data. Previous analyses with morphological (5) and small subunit (SSU) rRNA gene data (6) supported a monophyletic group called the Elegans group, consisting of C. elegans, C. briggsae, Caenorhabditis remanei, and an undescribed Caenorhabditis species (strain CB5161). However, the relationships of the Elegans group species to each other remained unresolved in these studies. Other mo...
Three-prime untranslated regions (3′UTRs) of metazoan messenger RNAs (mRNAs) contain numerous regulatory elements, yet remain largely uncharacterized. Using polyA capture, 3′ rapid amplification of complementary DNA (cDNA) ends, full-length cDNAs, and RNA-seq, we defined ∼26,000 distinct 3′UTRs in Caenorhabditis elegans for ∼85% of the 18,328 experimentally supported protein-coding genes and revised ∼40% of gene models. Alternative 3′UTR isoforms are frequent, often differentially expressed during development. Average 3′UTR length decreases with animal age. Surprisingly, no polyadenylation signal (PAS) was detected for 13% of polyadenylation sites, predominantly among shorter alternative isoforms. Trans-spliced (versus non–trans-spliced) mRNAs possess longer 3′UTRs and frequently contain no PAS or variant PAS. We identified conserved 3′UTR motifs, isoform-specific predicted microRNA target sites, and polyadenylation of most histone genes. Our data reveal a rich complexity of 3′UTRs, both genome-wide and throughout development.
At least 10% of C. elegans genes are predicted miRNA targets, and a number of nematode miRNAs seem to regulate biological processes by targeting functionally related genes. We have also developed and successfully utilized an in vivo system for testing miRNA target predictions in likely endogenous expression domains. The thousands of genome-wide miRNA target predictions for nematodes, humans, and flies are available from the PicTar website and are linked to an accessible graphical network-browsing tool allowing exploration of miRNA target predictions in the context of various functional genomic data resources.