The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures of TSPY, DAZ, and RBMY; 42 additional protein-coding genes, mostly from the TSPY gene family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome and mapped available population variation, clinical variants, and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Germline genetic variation contributes to cancer etiology, but self-reported race is not always consistent with genetic ancestry, and samples may not have identifying ancestry information. Here we describe a flexible computational pipeline, PopInf, to visualize principal components analysis output and assign ancestry to samples with unknown genetic ancestry, given a reference population panel of known origins. PopInf is implemented as a reproducible workflow in Snakemake with a tutorial on GitHub. We provide a pre-processed reference population panel that can be quickly and efficiently implemented in cancer genetics studies. We ran PopInf on TCGA liver cancer data and identified discrepancies between reported race and inferred genetic ancestry. Significance. The PopInf pipeline facilitates visualization and identification of genetic ancestry across samples, so that this ancestry can be accounted for in studies of disease risk. All code and a tutorial are available on GitHub: https://github.com/SexChrLab/PopInf.
Keywords: population ancestry, principal components analysis, visualization, computational pipeline, cancer GWAS
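The general idea of assigning ancestry from principal components can be illustrated with a minimal sketch. This is not PopInf's implementation: the abstract does not describe how PopInf assigns samples, so the nearest-centroid rule, the function names, the population labels, and the simulated genotypes below are all assumptions made for illustration.

```python
import numpy as np


def fit_pca(reference, n_components=2):
    """Fit PCA on a reference genotype matrix (samples x variants)."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal axes of the centered reference panel.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(genotypes, mean, axes):
    """Project samples onto principal axes learned from the reference panel."""
    return (genotypes - mean) @ axes.T


def assign_ancestry(ref_pcs, ref_labels, query_pcs):
    """Assumed rule: give each query sample the label of the nearest centroid."""
    names = sorted(set(ref_labels))
    label_array = np.asarray(ref_labels)
    centroids = np.array(
        [ref_pcs[label_array == name].mean(axis=0) for name in names]
    )
    distances = np.linalg.norm(
        query_pcs[:, None, :] - centroids[None, :, :], axis=2
    )
    return [names[i] for i in distances.argmin(axis=1)]


def simulate(rng, allele_freq, n_samples, n_variants=200):
    """Synthetic 0/1/2 genotypes; purely illustrative, not real data."""
    return rng.binomial(2, allele_freq, size=(n_samples, n_variants)).astype(float)


rng = np.random.default_rng(0)
# Two hypothetical reference populations with very different allele frequencies.
reference = np.vstack([simulate(rng, 0.1, 20), simulate(rng, 0.9, 20)])
ref_labels = ["popA"] * 20 + ["popB"] * 20
# Query samples whose ancestry is treated as unknown.
query = np.vstack([simulate(rng, 0.1, 5), simulate(rng, 0.9, 5)])

mean, axes = fit_pca(reference)
assigned = assign_ancestry(
    project(reference, mean, axes), ref_labels, project(query, mean, axes)
)
print(assigned)
```

The query samples are projected with the mean and axes learned from the reference panel alone, so the unknown samples do not influence the principal components they are placed on.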
Objectives: The aim of this study was to characterize the genetic relationships within and among four neighboring ethnolinguistic groups in northern Kenya in light of cultural relationships to understand the extent to which geography and culture shape patterns of genetic variation. Materials and methods: We collected DNA and demographic information pertaining to aspects of social identity and heritage from 572 individuals across the Turkana, Samburu, Waso Borana, and Rendille of northern Kenya. We sampled individuals across a total of nine clans from these four groups and, additionally, three territorial sections within the Turkana, and successfully genotyped 376 individuals.
Background: Neanderthal introgressed DNA has been linked to different normal and disease traits, including immunity and metabolism, two important functions that are altered in liver cancer. However, there is limited understanding of the relationship between Neanderthal introgression and liver cancer risk. The aim of this study was to investigate that relationship. Methods: Using germline and somatic DNA and tumor RNA from liver cancer patients from The Cancer Genome Atlas, along with ancestry-matched germline DNA from unaffected individuals from the 1000 Genomes Resource and allele-specific expression data from normal liver tissue from The Genotype-Tissue Expression project, we investigated whether Neanderthal introgression impacts cancer etiology. Using a previously generated set of Neanderthal alleles, we identified Neanderthal introgressed haplotypes. We then tested whether somatic mutations are enriched or depleted on Neanderthal introgressed haplotypes compared to modern haplotypes. We also computationally assessed whether somatic mutations have a functional effect or show evidence of regulating expression of Neanderthal haplotypes. Finally, we compared patterns of Neanderthal introgression in liver cancer patients and the general population. Results: We find that Neanderthal introgressed haplotypes exhibit an excess of somatic mutations compared to modern haplotypes. Variant Effect Predictor analysis revealed that most of the somatic mutations on these Neanderthal introgressed haplotypes are not functional. We did observe expression differences of Neanderthal alleles between tumor and normal tissue for four genes that also showed a pattern of enrichment of somatic mutations on Neanderthal haplotypes. However, gene expression was similar between liver cancer patients with modern ancestry and liver cancer patients with Neanderthal ancestry at these genes. Provocatively, when analyzing all genes, we find evidence of Neanderthal introgression regulating expression in tumors from liver cancer patients in two genes, AKR1C4 and OAS1. Finally, we find that most genes do not show a difference in the proportion of Neanderthal introgression between liver cancer patients and the general population. Conclusion: Our results suggest that Neanderthal introgression provides opportunity for somatic mutations to accumulate, and that some Neanderthal introgression may impact liver cancer risk.
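An enrichment comparison of the kind described in the Methods can be sketched as a one-sided test on a 2x2 table. The abstract does not say which statistical test or counting unit the authors used, so the choice of Fisher's exact test, the function name, and the counts below are hypothetical and serve only to show the shape of such a calculation.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null, of seeing at
    least `a` counts in the top-left cell given fixed row and column totals.
    """
    row1 = a + b          # e.g. introgressed haplotypes
    col1 = a + c          # e.g. haplotypes carrying a somatic mutation
    n = a + b + c + d
    upper = min(row1, col1)
    total = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return total / comb(n, col1)


# Hypothetical counts, NOT taken from the study:
# rows    = introgressed vs. modern haplotypes
# columns = with vs. without a somatic mutation
p_value = fisher_exact_greater(3, 1, 1, 3)
print(round(p_value, 4))
```

A small p-value in this setup would indicate more mutated introgressed haplotypes than expected by chance; testing for depletion would sum the lower tail instead.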