Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g., miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression, and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs, such as noise-buffering feed-forward loops. Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
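The feed-forward loop motif mentioned in this abstract can be illustrated with a minimal sketch. It is not the study's analysis pipeline: the edge list, the node names (TF_A, TF_B, GENE_X, GENE_Y) and the function name are hypothetical, and the code simply enumerates triples in which a regulator targets a gene both directly and through an intermediate regulator.

```python
from itertools import permutations


def feed_forward_loops(edges):
    """Return (a, b, c) triples where a->b, b->c and a->c are all edges."""
    edge_set = set(edges)
    # Collect every node that appears as a source or a target.
    nodes = sorted({node for edge in edge_set for node in edge})
    return [
        (a, b, c)
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    ]


# Hypothetical toy network: TF_A regulates GENE_X directly and via TF_B.
toy_edges = [
    ("TF_A", "TF_B"),
    ("TF_B", "GENE_X"),
    ("TF_A", "GENE_X"),
    ("TF_B", "GENE_Y"),
]
loops = feed_forward_loops(toy_edges)
print(loops)
```

Brute-force enumeration over all ordered triples is cubic in the number of nodes, so it is only suitable for small toy networks; assessing whether a motif is enriched would additionally require comparison against randomized networks, which this sketch does not attempt.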
The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We demonstrate the performance of the system by sequencing three bacterial genomes, and its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
DNA sequencing and, more recently, massively parallel DNA sequencing [1-4] have had a profound impact on research and medicine. The reductions in cost and time for generating DNA sequence have resulted in a range of new sequencing applications in cancer [5,6], human genetics [7], infectious diseases [8] and the study of personal genomes [9-11], as well as in fields as diverse as ecology [12,13] and the study of ancient DNA [14,15]. Although de novo sequencing costs have dropped substantially, there is a desire to continue to drop the cost of sequencing at an exponential rate consistent with the semiconductor industry's Moore's Law [16], as well as to provide lower-cost, faster and more portable devices. This has been operationalized by the desire to reach the $1,000 genome [17].
To date, DNA sequencing has been limited by its requirement for imaging technology, electromagnetic intermediates (either X-rays [18] or light [19]) and specialized nucleotides or other reagents [20]. To overcome these limitations and further democratize the practice of sequencing, a paradigm shift based on non-optical sequencing on newly developed integrated circuits was pursued. Owing to their scalability and low power requirements, CMOS processes are dominant in modern integrated circuit manufacturing [21]. The ubiquitous nature of computers, digital cameras and mobile phones has been made possible by the low-cost production of integrated circuits in CMOS.
Leveraging advances in the imaging field, which has produced large, fast arrays for photonic imaging [22], we sought a suitable electronic sensor for the construction of an integrated circuit to detect the hydrogen ions that would be released by DNA polymerase [23] during sequencing by synthesis, as opposed to a sensor designed for the detection of photons. Although a variety ...
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.
The sterol regulatory element-binding protein (SREBP) family member SREBP1 is a critical transcriptional regulator of cholesterol and fatty acid metabolism and has been implicated in insulin resistance, diabetes, and other diet-related diseases. We globally identified the promoters occupied by SREBP1 and its binding partners NFY and SP1 in a human hepatocyte cell line using chromatin immunoprecipitation combined with genome tiling arrays (ChIP-chip). We find that SREBP1 occupies the promoters of 1,141 target genes involved in diverse biological pathways, including novel targets with roles in lipid metabolism and insulin signaling. We also identify a conserved SREBP1 DNA-binding motif in SREBP1 target promoters, and we demonstrate that many SREBP1 target genes are transcriptionally activated by treatment with insulin and glucose using gene expression microarrays. Finally, we show that SREBP1 cooperates extensively with NFY and SP1 throughout the genome and that unique combinations of these factors target distinct functional pathways. Our results provide insight into the regulatory circuitry in which SREBP1 and its network partners coordinate a complex transcriptional response in the liver with cues from the diet.
PPARGC1A is a transcriptional coactivator that binds to and coactivates a variety of transcription factors (TFs) to regulate the expression of target genes. PPARGC1A plays a pivotal role in regulating energy metabolism and has been implicated in several human diseases, most notably type II diabetes. Previous studies have focused on the interplay between PPARGC1A and individual TFs, but little is known about how PPARGC1A combines with all of its partners across the genome to regulate transcriptional dynamics. In this study, we describe a core PPARGC1A transcriptional regulatory network operating in HepG2 cells treated with forskolin. We first mapped the genome-wide binding sites of PPARGC1A using chromatin-IP followed by high-throughput sequencing (ChIP-seq) and uncovered overrepresented DNA sequence motifs corresponding to known and novel PPARGC1A network partners. We then profiled six of these site-specific TF partners using ChIP-seq and examined their network connectivity and combinatorial binding patterns with PPARGC1A. Our analysis revealed extensive overlap of targets including a novel link between PPARGC1A and HSF1, a TF regulating the conserved heat shock response pathway that is misregulated in diabetes. Importantly, we found that different combinations of TFs bound to distinct functional sets of genes, thereby helping to reveal the combinatorial regulatory code for metabolic and other cellular processes. In addition, the different TFs often bound near the promoters and coding regions of each other's genes suggesting an intricate network of interdependent regulation. Overall, our study provides an important framework for understanding the systems-level control of metabolic gene expression in humans.
Studies of the proteome would benefit greatly from methods to directly sequence and digitally quantify proteins and detect posttranslational modifications with single-molecule sensitivity. Here, we demonstrate single-molecule protein sequencing using a dynamic approach in which single peptides are probed in real time by a mixture of dye-labeled N-terminal amino acid recognizers and simultaneously cleaved by aminopeptidases. We annotate amino acids and identify the peptide sequence by measuring fluorescence intensity, lifetime, and binding kinetics on an integrated semiconductor chip. Our results demonstrate the kinetic principles that allow recognizers to identify multiple amino acids in an information-rich manner that enables discrimination of single amino acid substitutions and posttranslational modifications. With further development, we anticipate that this approach will offer a sensitive, scalable, and accessible platform for single-molecule proteomic studies and applications.
Human embryonic stem cells (hESCs) can be induced and differentiated to form a relatively homogeneous population of neuronal precursors in vitro. We have used this system to screen for genes necessary for neural lineage development by using a pooled human short hairpin RNA (shRNA) library screen and massively parallel sequencing. We confirmed known genes and identified several unpredicted genes with interrelated functions that were specifically required for the formation or survival of neuronal progenitor cells without interfering with the self-renewal capacity of undifferentiated hESCs. Among these are several genes that have been implicated in various neurodevelopmental disorders (i.e., brain malformations, mental retardation, and autism). Unexpectedly, a set of genes mutated in late-onset neurodegenerative disorders and with roles in the formation of RNA granules were also found to interfere with neuronal progenitor cell formation, suggesting their functional relevance in early neurogenesis. This study advances the feasibility and utility of using pooled shRNA libraries in combination with next-generation sequencing for a high-throughput, unbiased functional genomic screen. Our approach can also be used with patient-specific human-induced pluripotent stem cell-derived neural models to obtain unparalleled insights into developmental and degenerative processes in neurological or neuropsychiatric disorders with monogenic or complex inheritance.
Our general aim is to identify genes and pathways of early neural differentiation that are relevant for the development and function of the human nervous system. In vitro differentiation of human embryonic stem cells (hESCs) yielding developmentally competent neuronal progenitors is an attractive model for these studies, and several approaches for this have been developed, including those by our group (1).
Large-scale loss-of-function analysis by RNA interference (RNAi) using small hairpin RNA (shRNA)- or small interfering RNA (siRNA)-mediated knockdown of genes is a powerful screening approach in mammalian cells (2-7). ShRNAs provide the opportunity for long-term silencing by stable infection of cells maintained under selection pressure. Furthermore, pooled libraries of shRNAs can be efficiently used instead of arrayed individual clones. Screening experiments can be designed for detection of shRNAs that produce a desired effect in a cell (positive screens) or for the depletion of shRNA species that silence a necessary gene (dropout screens). The abundance of shRNAs has been measured by using barcodes and array hybridization (8, 9), but this procedure is subject to cross-hybridization and nonlinear responses. Most positive screens have been limited to studies of cellular proliferation or survival because it is easier to analyze the recovered enriched shRNAs (10). For dropout screens it is critical to accurately measure the abundance of shRNAs that are retained in cells at the end of the experiment to determine which shRNAs were depleted (8). In part because of this challenge, ...
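The dropout-screen idea described above, comparing shRNA abundance at the end of the experiment with the starting pool, can be sketched as a log2 fold change of normalized sequencing read counts. This is an illustrative sketch under stated assumptions, not the authors' analysis: the counts, the shRNA identifiers, the pseudocount of 1 and the depletion cutoff of -1 are all hypothetical choices.

```python
import math


def log2_fold_changes(start_counts, end_counts, pseudocount=1.0):
    """Log2 ratio of each shRNA's read fraction at the end vs. the start.

    A pseudocount (an assumed value) keeps fully depleted shRNAs finite.
    """
    n = len(start_counts)
    start_total = sum(start_counts.values()) + pseudocount * n
    # Only shRNAs present in the starting pool contribute to the end total.
    end_total = sum(end_counts.get(s, 0) for s in start_counts) + pseudocount * n
    result = {}
    for shrna, start in start_counts.items():
        end = end_counts.get(shrna, 0)
        start_frac = (start + pseudocount) / start_total
        end_frac = (end + pseudocount) / end_total
        result[shrna] = math.log2(end_frac / start_frac)
    return result


# Hypothetical read counts for three shRNAs before and after selection.
start = {"sh1": 99, "sh2": 99, "sh3": 99}
end = {"sh1": 199, "sh2": 99, "sh3": 0}

lfc = log2_fold_changes(start, end)
# Call an shRNA depleted if its abundance at least halved (assumed cutoff).
depleted = sorted(s for s, value in lfc.items() if value <= -1.0)
print(depleted)
```

Normalizing to read fractions rather than raw counts makes the two time points comparable even when sequencing depth differs; a real screen would also need replicates and a statistical model, which are outside the scope of this sketch.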
Proteins are the main structural and functional components of cells, and their dynamic regulation and post-translational modifications (PTMs) underlie cellular phenotypes. Next-generation DNA sequencing technologies have revolutionized our understanding of heredity and gene regulation, but the complex and dynamic states of cells are not fully captured by the genome and transcriptome. Sensitive measurements of the proteome are needed to fully understand biological processes and changes to the proteome that occur in disease states. Studies of the proteome would benefit greatly from methods to directly sequence and digitally quantify proteins and detect PTMs with single-molecule sensitivity and precision. However, current methods for studying the proteome lag behind DNA sequencing in throughput, sensitivity, and accessibility due to the complexity and dynamic range of the proteome, the chemical properties of proteins, and the inability to amplify proteins. Here, we demonstrate single-molecule protein sequencing on a compact benchtop instrument using a dynamic sequencing-by-stepwise-degradation approach in which single surface-immobilized peptide molecules are probed in real time by a mixture of dye-labeled N-terminal amino acid recognizers and simultaneously cleaved by aminopeptidases. By measuring fluorescence intensity, lifetime, and binding kinetics of recognizers on an integrated semiconductor chip, we are able to annotate amino acids and identify the peptide sequence. We describe the expansion of the number of recognizable amino acids and demonstrate the kinetic principles that allow individual recognizers to identify multiple amino acids in a highly information-rich manner that is sensitive to adjacent residues. Furthermore, we demonstrate that our method is compatible with both synthetic and natural peptides, and capable of detecting single amino acid changes and PTMs. We anticipate that with further development our protein sequencing method will offer a sensitive, scalable, and accessible platform for studies of the proteome.