The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
We report the Simons Genome Diversity Project (SGDP) dataset: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioral modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that in other non-Africans.
Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.
Reprogramming of cellular metabolism is a key event during tumorigenesis. Despite being known for decades (Warburg effect), the molecular mechanisms regulating this switch remained unexplored. Here, we identify SIRT6 as a novel tumor suppressor that regulates aerobic glycolysis in cancer cells. Importantly, loss of SIRT6 leads to tumor formation without activation of known oncogenes, while transformed SIRT6-deficient cells display increased glycolysis and tumor growth, suggesting that SIRT6 plays a role in both establishment and maintenance of cancer. Using a conditional SIRT6 allele, we show that SIRT6 deletion in vivo increases the number, size and aggressiveness of tumors. SIRT6 also functions as a novel regulator of ribosome metabolism by co-repressing MYC transcriptional activity. Lastly, SIRT6 is selectively downregulated in several human cancers, and expression levels of SIRT6 predict prognosis and tumor-free survival rates, highlighting SIRT6 as a critical modulator of cancer metabolism. Our studies reveal SIRT6 to be a potent tumor suppressor acting to suppress cancer metabolism.
Summary The aberrant transcription factor EWS-FLI1 drives Ewing sarcoma yet its molecular function is incompletely understood. We find that EWS-FLI1 reprograms gene regulatory circuits in Ewing sarcoma by directly inducing or repressing enhancers. At GGAA repeat elements, which lack evolutionary conservation and regulatory potential in other cell types, EWS-FLI1 multimers induce chromatin opening and create de novo enhancers that physically interact with target promoters. Conversely, EWS-FLI1 inactivates conserved enhancers containing canonical ETS motifs by displacing wild type ETS transcription factors. These divergent chromatin-remodeling patterns repress tumor suppressors and mesenchymal lineage regulators, while activating oncogenes and new potential therapeutic targets, such as the kinase VRK1. Our findings demonstrate how EWS-FLI1 establishes an oncogenic regulatory program governing both tumor survival and differentiation.
We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including SNVs, MNVs, indels, STRs, and CNVs. Of these, CNVs contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree based on binary SNVs and projected the more complex variants onto it, estimating the numbers of mutations for each class. Our phylogeny reveals bursts of extreme expansions in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.
Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases, population genetics applications, and forensic casework. However, STRs have proven problematic to genotype from high-throughput sequencing data. Here, we describe HipSTR, a novel haplotype-based method for robustly genotyping and phasing STRs from Illumina sequencing data and report a genome-wide analysis and validation of de novo STR mutations.
SUMMARY Hundreds of Chromatin Regulators (CRs) control chromatin structure and function by catalyzing and binding histone modifications, yet the rules governing these key processes remain obscure. Here, we present a systematic approach to infer CR function. We developed ChIP-string, a meso-scale assay that combines chromatin immunoprecipitation with a signature readout of 487 representative loci. We applied ChIP-string to screen 145 antibodies, thereby identifying effective reagents, which we used to map the genome-wide binding of 29 CRs in two cell types. We found that specific combinations of CRs co-localize in characteristic patterns at distinct chromatin environments, genes of coherent functions and distal regulatory elements. When comparing between cell types, CRs redistribute to different loci, but maintain their modular and combinatorial associations. Our work provides a multiplex method that substantially enhances the ability to monitor CR binding, presents a large resource of CR maps, and reveals common principles for combinatorial CR function.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.