DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA [1–5] and contain genetic variations associated with diseases and phenotypic traits [6–8]. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.
The Peloponnese has been one of the cradles of Classical European civilization and an important contributor to ancient European history. It has also been the subject of a controversy about the ancestry of its population. In a theory hotly debated by scholars for over 170 years, the German historian Jacob Philipp Fallmerayer proposed that the medieval Peloponneseans were totally extinguished by Slavic and Avar invaders and replaced by Slavic settlers during the 6th century CE. Here we use 2.5 million single-nucleotide polymorphisms to investigate the genetic structure of Peloponnesean populations in a sample of 241 individuals originating from all districts of the peninsula and to examine predictions of the theory of replacement of the medieval Peloponneseans by Slavs. We find considerable heterogeneity of Peloponnesean populations, exemplified by genetically distinct subpopulations and by gene flow gradients within the Peloponnese. By principal component analysis (PCA) and ADMIXTURE analysis, the Peloponneseans are clearly distinguishable from the populations of the Slavic homeland and are very similar to Sicilians and Italians. Using a novel method of quantitative analysis of ADMIXTURE output, we find that the Slavic ancestry of Peloponnesean subpopulations ranges from 0.2 to 14.4%. Subpopulations considered by Fallmerayer to be Slavic tribes or to have Near Eastern origin have no significant ancestry of either. This study rejects the theory of extinction of medieval Peloponneseans and illustrates how genetics can clarify important aspects of the history of a human population.
DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA and harbor disease- and phenotypic trait-associated genetic variation. We established high-precision maps of DNase I hypersensitive sites from 733 human biosamples encompassing 439 cell and tissue types and states, and integrated these to precisely delineate and numerically index ~3.6 million DHSs encoded within the human genome, providing a common coordinate system for regulatory DNA. Here we show that the expansive scale of cell and tissue states sampled exposes an unprecedented degree of stereotyped actuation of large sets of elements, signaling the operation of distinct genome-scale regulatory programs. We show further that the complex actuation patterns of individual elements can be captured comprehensively by a simple regulatory vocabulary reflecting their dominant cellular manifestation. This vocabulary, in turn, enables comprehensive and quantitative regulatory annotation of both protein-coding genes and the vast array of well-defined but poorly characterized non-coding RNA genes. Finally, we show that the combination of high-precision DHSs and regulatory vocabularies markedly concentrates disease- and trait-associated non-coding genetic signals both along the genome and across cellular compartments. Taken together, our results provide a common and extensible coordinate system and vocabulary for human regulatory DNA, and a new global perspective on the architecture of human gene regulation.