SUMMARY: Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ~1% of all eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ~34% of the ~170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in ChIP-seq peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif “library” (http://cisbp.ccbr.utoronto.ca) can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.
The RNA-Binding Protein DataBase (RBPDB) is a collection of experimental observations of RNA-binding sites, both in vitro and in vivo, manually curated from primary literature. To build RBPDB, we performed a literature search for experimental binding data for all RNA-binding proteins (RBPs) with known RNA-binding domains in four metazoan species (human, mouse, fly and worm). In total, RBPDB contains binding data on 272 RBPs, including 71 that have motifs in position weight matrix format, and 36 sets of sequences of in vivo-bound transcripts from immunoprecipitation experiments. The database is accessible through a web interface that allows browsing by domain or by organism, searching and export of records, and bulk data downloads. Users can also use RBPDB to scan sequences for RBP-binding sites. RBPDB is freely available, without registration, at http://rbpdb.ccbr.utoronto.ca/.
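As an illustration of what scanning a sequence with a position weight matrix involves, the following is a minimal Python sketch. It is not RBPDB's implementation: the toy motif `TOY_PWM`, the uniform background, the score threshold, and the function names `log_odds` and `scan` are all assumptions made for this example.

```python
import math

# Hypothetical 4-position motif given as per-position base probabilities.
# These are toy values for illustration, not a matrix taken from RBPDB.
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "U": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "U": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "U": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "U": 0.05},
]

# Assumed uniform background frequency for each of the four bases.
BACKGROUND = 0.25


def log_odds(pwm, background=BACKGROUND):
    """Convert base probabilities into log2 odds against the background."""
    return [
        {base: math.log2(prob / background) for base, prob in column.items()}
        for column in pwm
    ]


def scan(sequence, pwm, threshold=0.0):
    """Return (start, window, score) for every window scoring >= threshold."""
    scores = log_odds(pwm)
    width = len(scores)
    # Accept DNA-style input by treating T as U.
    seq = sequence.upper().replace("T", "U")
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Skip windows containing ambiguous or unknown characters.
        if any(base not in "ACGU" for base in window):
            continue
        score = sum(scores[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    for start, window, score in scan("CCUGUACC", TOY_PWM, threshold=4.0):
        print(start, window, round(score, 2))
```

The threshold here is arbitrary; in practice it would be chosen per motif, for example from the score distribution over background sequences.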
Background: Brain tumor (BRAT) is a Drosophila member of the TRIM-NHL protein family. This family is conserved among metazoans and its members function as post-transcriptional regulators. BRAT was thought to be recruited to mRNAs indirectly through interaction with the RNA-binding protein Pumilio (PUM). However, it has recently been demonstrated that BRAT directly binds to RNA. The precise sequence recognized by BRAT, the extent of BRAT-mediated regulation, and the exact roles of PUM and BRAT in post-transcriptional regulation are unknown. Results: Genome-wide identification of transcripts associated with BRAT or with PUM in Drosophila embryos shows that they bind largely non-overlapping sets of mRNAs. BRAT binds mRNAs that encode proteins associated with a variety of functions, many of which are distinct from those implemented by PUM-associated transcripts. Computational analysis of in vitro and in vivo data identified a novel RNA motif recognized by BRAT that confers BRAT-mediated regulation in tissue culture cells. The regulatory status of BRAT-associated mRNAs suggests a prominent role for BRAT in post-transcriptional regulation, including a previously unidentified role in transcript degradation. Transcriptomic analysis of embryos lacking functional BRAT reveals an important role in mediating the decay of hundreds of maternal mRNAs during the maternal-to-zygotic transition. Conclusions: Our results represent the first genome-wide analysis of the mRNAs associated with a TRIM-NHL protein and the first identification of an RNA motif bound by this protein family. BRAT is a prominent post-transcriptional regulator in the early embryo through mechanisms that are largely independent of PUM. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0659-4) contains supplementary material, which is available to authorized users.
TRIM-NHL proteins are conserved among metazoans and control cell fate decisions in various stem cell lineages. The Drosophila TRIM-NHL protein Brain tumor (Brat) directs differentiation of neuronal stem cells by suppressing self-renewal factors. Brat is an RNA-binding protein and functions as a translational repressor. However, it is unknown which RNAs Brat regulates and how RNA-binding specificity is achieved. Using RNA immunoprecipitation and RNAcompete, we identify Brat-bound mRNAs in Drosophila embryos and define consensus binding motifs for Brat as well as a number of additional TRIM-NHL proteins, indicating that TRIM-NHL proteins are conserved, sequence-specific RNA-binding proteins. We demonstrate that Brat-mediated repression and direct RNA-binding depend on the identified motif and show that binding of the localization factor Miranda to the Brat-NHL domain inhibits Brat activity. Finally, to unravel the sequence specificity of the NHL domain, we crystallize the Brat-NHL domain in complex with RNA and present a high-resolution protein-RNA structure of this fold.
The development of malaria parasites throughout their various life cycle stages is coordinated by changes in gene expression. We previously showed that the three-dimensional organization of the Plasmodium falciparum genome is strongly associated with gene expression during its replication cycle inside red blood cells. Here, we analyze genome organization in the P. falciparum and P. vivax transmission stages. Major changes occur in the localization and interactions of genes involved in pathogenesis and immune evasion, host cell invasion, sexual differentiation, and master regulation of gene expression. Furthermore, we observe reorganization of subtelomeric heterochromatin around genes involved in host cell remodeling. Depletion of heterochromatin protein 1 (PfHP1) resulted in loss of interactions between virulence genes, confirming that PfHP1 is essential for maintenance of the repressive center. Our results suggest that the three-dimensional genome structure of human malaria parasites is strongly connected with transcriptional activity of specific gene families throughout the life cycle.
The budding yeast Saccharomyces cerevisiae is a long-standing model for the three-dimensional organization of eukaryotic genomes. However, even in this well-studied model, it is unclear how homolog pairing in diploids or environmental conditions influence overall genome organization. Here, we performed high-throughput chromosome conformation capture on diverged Saccharomyces hybrid diploids to obtain the first global view of chromosome conformation in diploid yeasts. After controlling for the Rabl-like orientation using a polymer model, we observe significant homolog proximity that increases in saturated culture conditions. Surprisingly, we observe a localized increase in homologous interactions between the HAS1-TDA1 alleles specifically under galactose induction and saturated growth. This pairing is accompanied by relocalization to the nuclear periphery and requires Nup2, suggesting a role for nuclear pore complexes. Together, these results reveal that the diploid yeast genome has a dynamic and complex 3D organization. DOI: http://dx.doi.org/10.7554/eLife.23623.001
RNA-binding proteins (RBPs) are important regulators of eukaryotic gene expression. Genomes typically encode dozens to hundreds of proteins containing RNA-binding domains, which collectively recognize diverse RNA sequences and structures. Recent advances in high-throughput methods for assaying the targets of RBPs in vitro and in vivo allow large-scale derivation of RNA-binding motifs as well as determination of RNA–protein interactions in living cells. In parallel, many computational methods have been developed to analyze and interpret these data. The interplay between RNA secondary structure and RBP binding has also been a growing theme. Integrating RNA–protein interaction data with observations of post-transcriptional regulation will enhance our understanding of the roles of these important proteins.