Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g. miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs, e.g. noise-buffering feed-forward loops. Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.
The exploration of copy-number variation (CNV), notably of somatic cells, is an understudied aspect of genome biology. Any differences in the genetic makeup between twins derived from the same zygote represent an irrefutable example of somatic mosaicism. We studied 19 pairs of monozygotic twins with either concordant or discordant phenotype by using two platforms for genome-wide CNV analyses and showed that CNVs exist within pairs in both groups. These findings have an impact on our views of genotypic and phenotypic diversity in monozygotic twins and suggest that CNV analysis in phenotypically discordant monozygotic twins may provide a powerful tool for identifying disease-predisposition loci. Our results also imply that caution should be exercised when interpreting disease causality of de novo CNVs found in patients based on analysis of a single tissue in routine disease-related DNA diagnostics.
Most human transcription factors bind a small subset of potential genomic sites and often use different subsets in different cell types. To identify mechanisms that govern cell type-specific transcription factor binding, we used an integrative approach to study estrogen receptor α (ER). We found that ER exhibits two distinct modes of binding. Shared sites, bound in multiple cell types, are characterized by high affinity estrogen response elements (EREs), inaccessible chromatin and a lack of DNA methylation, while cell-specific sites are characterized by a lack of EREs, co-occurrence with other transcription factors and cell type-specific chromatin accessibility and DNA methylation. These observations enabled accurate quantitative models of ER binding that suggest tethering of ER to one-third of cell-specific sites. The distinct properties of cell-specific binding were also observed with glucocorticoid receptor and for ER in primary mouse tissues, representing an elegant genomic encoding scheme for generating cell type-specific gene regulation.
Two major types of genetic variation are known: single nucleotide polymorphisms (SNPs), and a more recently discovered structural variation, involving changes in copy number (CNVs) of kilobase- to megabase-sized chromosomal segments. It is unknown whether CNVs arise in somatic cells; however, it is generally assumed that normal cells are genetically identical. We tested 34 tissue samples from three subjects and, having analyzed for each tissue ≤10⁻⁶ of all cells expected in an adult human, we observed at least six CNVs, affecting a single organ or one or more tissues of the same subject. The CNVs ranged from 82 to 176 kb, often encompassing known genes, potentially affecting gene function. Our results indicate that humans are commonly affected by somatic mosaicism for stochastic CNVs, which occur in a substantial fraction of cells. The majority of described CNVs were previously shown to be polymorphic between unrelated subjects, suggesting that some CNVs previously reported as germline might represent somatic events, since in most studies of this kind only one tissue is examined and analysis of the parents of the studied subjects is not routinely performed. A considerable number of human phenotypes are a consequence of a somatic process. Thus, our conclusions will be important for the delineation of genetic factors behind these phenotypes. Consequently, biobanks should consider sampling multiple tissues to better address mosaicism in the studies of somatic disorders.