RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.
Cellular states are determined by differential expression of the cell’s proteins. The relationship between protein and mRNA expression levels reflects the combined outcomes of translation and protein degradation, which are, in addition to transcription and mRNA stability, essential contributors to the regulation of gene expression. This review summarizes the state of knowledge about large-scale measurements of absolute protein and mRNA expression levels, and the degree of correlation between the two parameters. We summarize the information that can be derived from comparison of protein and mRNA expression levels and discuss how corresponding sequence characteristics suggest modes of regulation.
Cellular imbalances of cholesterol and fatty acid metabolism result in pathological processes, including atherosclerosis and metabolic syndrome. Recent work from our group and others has shown that the intronic microRNAs hsa-miR-33a and hsa-miR-33b are located within the sterol regulatory element-binding protein-2 and -1 genes, respectively, and regulate cholesterol homeostasis in concert with their host genes. Here, we show that miR-33a and -b also regulate genes involved in fatty acid metabolism and insulin signaling. miR-33a and -b target key enzymes involved in the regulation of fatty acid oxidation, including carnitine O-octanoyltransferase, carnitine palmitoyltransferase 1A, hydroxyacyl-CoA dehydrogenase, Sirtuin 6 (SIRT6), and AMP kinase subunit-α. Moreover, miR-33a and -b also target the insulin receptor substrate 2, an essential component of the insulin-signaling pathway in the liver. Overexpression of miR-33a and -b reduces both fatty acid oxidation and insulin signaling in hepatic cell lines, whereas inhibition of endogenous miR-33a and -b increases these two metabolic pathways. Together, these data establish that miR-33a and -b regulate pathways controlling three of the risk factors of metabolic syndrome, namely levels of HDL, triglycerides, and insulin signaling, and suggest that inhibitors of miR-33a and -b may be useful in the treatment of this growing health concern. Keywords: lipid homeostasis | posttranscriptional regulation | cardiovascular disease
We provide a large-scale dataset on absolute protein and matching mRNA concentrations from the human medulloblastoma cell line Daoy. The correlation between mRNA and protein concentrations is significant and positive (Rs=0.46, R²=0.29, P-value<2e-16), although non-linear. Out of ∼200 tested sequence features, sequence length, frequency and properties of amino acids, as well as translation initiation-related features are the strongest individual correlates of protein abundance when accounting for variation in mRNA concentration. When integrating mRNA expression data and all sequence features into a non-parametric regression model (Multivariate Adaptive Regression Splines), we were able to explain up to 67% of the variation in protein concentrations. Half of the contributions were attributed to mRNA concentrations, the other half to sequence features relating to regulation of translation and protein degradation. The sequence features are primarily linked to the coding and 3′ untranslated region. To our knowledge, this is the most comprehensive predictive model of human protein concentrations achieved so far.
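As a rough illustration of the two statistics quoted in the abstract above, the following sketch computes a Spearman rank correlation (Rs) and a squared Pearson correlation (R²) between paired mRNA and protein concentrations. It is not the authors' pipeline: the function names are hypothetical, the example values are made up, and taking R² on log10-transformed concentrations is an assumption made here for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def r_squared_log10(x, y):
    """Squared Pearson correlation of log10-transformed values (assumed scale)."""
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    return pearson(log_x, log_y) ** 2


if __name__ == "__main__":
    # Made-up concentrations, for demonstration only (not data from the study).
    mrna = [1.2, 3.5, 0.8, 10.0, 4.1, 2.2]
    protein = [150.0, 90.0, 40.0, 2000.0, 700.0, 60.0]
    print("Rs =", round(spearman(mrna, protein), 3))
    print("R2 =", round(r_squared_log10(mrna, protein), 3))
```

Because Spearman's coefficient depends only on ranks, it is unaffected by the non-linearity the abstract mentions, whereas R² on a chosen scale measures how much variance a linear fit on that scale explains.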