Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g., miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression, and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (e.g., noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.
Gene expression differs among both individuals and populations and is thought to be a major determinant of phenotypic variation. Although variation and genetic loci responsible for RNA expression levels have been analyzed extensively in human populations1–5, our knowledge is limited regarding the differences in human protein abundance and their genetic basis. Variation in mRNA expression is not a perfect surrogate for protein expression because the latter is influenced by a battery of post-transcriptional regulatory mechanisms, and, empirically, the correlation between protein and mRNA levels is generally modest6,7. Here we used isobaric tandem mass tag (TMT)-based quantitative mass spectrometry to determine relative protein levels of 5953 genes in lymphoblastoid cell lines (LCLs) from 95 diverse individuals genotyped in the HapMap Project8,9. We found that protein levels are heritable molecular phenotypes that exhibit considerable variation between individuals, populations, and sexes. Levels of specific sets of proteins involved in the same biological process co-vary among individuals, indicating that these processes are tightly regulated at the protein level. We identified cis-pQTLs (protein quantitative trait loci), including variants not detected by previous transcriptome studies. This study demonstrates the feasibility of high throughput human proteome quantification which, when integrated with DNA variation and transcriptome information, adds a new dimension to the characterization of gene expression regulation.
Spectral count, defined as the total number of spectra identified for a protein, has gained acceptance as a practical, label-free, semiquantitative measure of protein abundance in proteomic studies. In this review, we discuss issues affecting the performance of spectral counting relative to other label-free methods, as well as its limitations. Possible consequences of modifications, which are commonly applied to raw spectral counts to improve abundance estimations, are considered. The use of spectral counting for different types of quantitation studies is explored and critiqued. Different statistical methods and underlying frameworks that have been applied to spectral count analysis are described and compared, and problem areas that undermine confident statistical analysis are considered. Finally, the issue of accurate estimation of false-discovery rates is addressed and identified as a major current challenge in quantitative proteomics.
Protein phosphorylation events during T cell receptor (TCR) signaling control the formation of complexes among proteins proximal to the TCR, the activation of kinase cascades, and the activation of transcription factors; however, the mode and extent of the influence of phosphorylation in coordinating the diverse phenomena associated with T cell activation are unclear. Therefore, we used the human Jurkat T cell leukemia cell line as a model system and performed large-scale quantitative phosphoproteomic analyses of TCR signaling. We identified 10,665 unique phosphorylation sites, of which 696 showed TCR-responsive changes. In addition, we analyzed broad trends in phosphorylation data sets to uncover underlying mechanisms associated with T cell activation. We found that, upon stimulation of the TCR, phosphorylation events extensively targeted protein modules involved in all of the salient phenomena associated with T cell activation: patterning of surface proteins, endocytosis of the TCR, formation of the F-actin cup, inside-out activation of integrins, polarization of microtubules, production of cytokines, and alternative splicing of messenger RNA. Further, case-by-case analysis of TCR-responsive phosphorylation sites on proteins belonging to relevant functional modules together with network analysis allowed us to deduce that serine-threonine (S-T) phosphorylation modulated protein-protein interactions (PPIs) in a system-wide fashion. We also provide experimental support for this inference by showing that phosphorylation of tubulin on six distinct serine residues abrogated PPIs during the assembly of microtubules. We propose that modulation of PPIs by stimulus-dependent changes in S-T phosphorylation state is a widespread phenomenon applicable to many other signaling systems.
Transcription of eukaryotic genomes is carried out by three distinct RNA polymerases (I, II, and III), and each polymerase is thought to independently transcribe a distinct set of genes. To investigate a possible relationship between RNA polymerases II and III, we mapped their in vivo binding sites throughout the human genome by using ChIP-Seq in two different cell lines, GM12878 and K562. Pol III was found to bind near many known genes as well as several previously unidentified target genes. RNA-Seq studies indicate that a majority of the bound genes are expressed, although a subset are not, suggestive of stalling by RNA polymerase III. Pol II was found to bind near many known Pol III genes, including tRNA, U6, HVG, hY, 7SK, and previously unidentified Pol III target genes. Similarly, in vivo binding studies reveal that a number of transcription factors normally associated with Pol II transcription, including c-Fos, c-Jun, and c-Myc, also tightly associate with most Pol III-transcribed genes. Inhibition of Pol II activity using α-amanitin reduced expression of a number of Pol III genes (e.g., U6, hY, HVG), suggesting that Pol II plays an important role in regulating their transcription. These results indicate that, contrary to previous expectations, polymerases can often work with one another to globally coordinate gene expression.
ChIP-Seq | RNA-Seq | transcription | gene regulation
All nuclear genes in eukaryotes are transcribed by three RNA polymerases, Pol I, II, and III. Although polymerases are known to share subunits (1) and even a small number of targets (2, 3), they transcribe distinct classes of genes: Pol I transcribes 18S, 28S, and 5.8S rDNA genes; Pol II transcribes protein-coding genes and many noncoding RNA genes; and Pol III transcribes three different classes of genes, including 5S (Class I), tRNA (Class II), and many small noncoding genes (Class III). The latter include U6, hY, 7SK, and vault (HVG) genes, among others.
For all three classes of Pol III genes, sequences upstream of the genes have been found to be required for their optimal expression, although the factors that bind upstream of Pol III genes are largely unknown (4). In general, transcription by the different polymerases is believed to be largely independent of one another. However, components normally associated with Pol II have been found to be associated with Pol III subunits and Pol III-transcribed genes. c-Myc, which is normally associated with Pol II-transcribed genes, has been found associated with TFIIIB, a Pol III component (5), and TFIIS, a transcription elongation factor for Pol II, was recently found to bind near Pol III genes in yeast (6). In addition, Pol II has been found to bind upstream of several U6 genes and enhance their transcription (3). However, the extent to which Pol II and its partner proteins coassociate with Pol III and coordinate expression has not been studied on a genome scale. This information is valuable for both understanding the functional organization of the human genome and whether basic cellular...
Summary: Different trans-acting factors (TFs) collaborate and act in concert at distinct loci to accurately regulate their target genes. To date, the co-binding of TF pairs has been investigated only in a limited context, both in terms of the number of factors within a cell type and across cell types and in the extent of combinatorial co-localizations. Here we use a novel approach to analyze TF co-localization within a cell type and across multiple cell lines at an unprecedented level. We extend this approach with large-scale mass spectrometry analysis of immunoprecipitations of 50 TFs. Our combined approach reveals large numbers of interesting and novel TF-TF associations. We observe extensive changes in TF co-localizations both within a cell type exposed to different conditions and across multiple cell types. We show distinct functional annotations and properties of different TF co-binding patterns and provide new insights into the complex regulatory landscape of the cell.
Successful treatment of multiple cancer types requires early detection and identification of reliable biomarkers present in specific cancer tissues. To test the feasibility of identifying proteins from archival cancer tissues, we have developed a methodology, termed direct tissue proteomics (DTP), which can be used to identify proteins directly from formalin-fixed paraffin-embedded prostate cancer tissue samples. Using minute prostate biopsy sections, we demonstrate the identification of 428 prostate-expressed proteins using the shotgun method. Because the DTP method is not quantitative, we employed the absolute quantification method and demonstrate picogram level quantification of prostate-specific antigen. In depth bioinformatics analysis of these expressed proteins affords the categorization of metabolic pathways that may be important for distinct stages of prostate carcinogenesis. Furthermore, we validate Wnt-3 as an upregulated protein in cancerous prostate cells by immunohistochemistry. We propose that this general strategy provides a roadmap for successful identification of critical molecular targets of multiple cancer types.