The pan-cancer analysis of whole genomes
The expansion of whole-genome sequencing studies from individual ICGC and TCGA working groups presented the opportunity to undertake a meta-analysis of genomic features across tumour types. To achieve this, the PCAWG Consortium was established. A Technical Working Group implemented the informatics analyses by aggregating the raw sequencing data from different working groups that studied individual tumour types, aligning the sequences to the human genome and delivering a set of high-quality somatic mutation calls for downstream analysis (Extended Data Fig. 1). Given the recent meta-analysis
Genetic diversity arises from recombination and de novo mutation (DNM). Using a combination of microarray genotype and whole-genome sequence data on parent-child pairs, we identified 4,531,535 crossover recombinations and 200,435 DNMs. The resulting genetic map has a resolution of 682 base pairs. Crossovers exhibit a mutagenic effect, with overrepresentation of DNMs within 1 kilobase of crossovers in males and females. In females, a higher mutation rate is observed up to 40 kilobases from crossovers, particularly for complex crossovers, which increase with maternal age. We identified 35 loci associated with the recombination rate or the location of crossovers, demonstrating extensive genetic control of meiotic recombination, and our results highlight genes linked to the formation of the synaptonemal complex as determinants of crossovers.
Long-read sequencing (LRS) promises to improve characterization of structural variants (SVs), a major source of genetic diversity. We generated LRS data on 3,622 Icelanders using Oxford Nanopore Technologies, and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions), spanning a median of 10 Mb per haploid genome. We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association with a rare (AF = 0.037%) deletion of the first exon of PCSK9. Carriers of this deletion have 0.93 mmol/L (1.31 SD) lower LDL cholesterol levels than the population average (p = 7.0 × 10⁻²⁰). We also discovered an association with a multi-allelic SV inside a large repeat region, contained within single long reads, in an exon of ACAN. Within this repeat region we found 11 alleles that differ in the number of repeats of a 57 bp motif, and observed a linear relationship (0.016 SD per motif inserted, p = 6.2 × 10⁻¹⁸) between the number of repeats carried and height. These results show that SVs can be accurately characterized at population scale using long-read sequence data in a genome-wide, non-targeted approach and demonstrate how SVs impact phenotypes.
Human sequence diversity is partially due to structural variants (SVs) [1]: genomic rearrangements affecting at least 50 bp of sequence in the form of insertions, deletions, inversions, or translocations.
The number of SVs carried by each individual is less than the number of single nucleotide polymorphisms (SNPs) and short (< 50 bp) insertions and deletions (indels), but their greater size makes them more likely to have a functional role [2], as evidenced by their disproportionately large impact on diseases and other traits [2,3]. Extensive characterization of three trios sequenced using several technologies [4] and an annotated set based on one sample (HG002) [5] indicate that humans carry 23,000 to 31,000 SVs.
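The two effect sizes quoted in the abstract above can be read with a little arithmetic. The sketch below is illustrative only: it uses the reported figures (0.93 mmol/L, 1.31 SD, 0.016 SD per motif), while the function names and the assumption that the per-motif effect is simply multiplied by a hypothetical motif count are ours, not the authors' method.

```python
# Figures quoted in the abstract; everything else here is an illustrative assumption.
LDL_EFFECT_MMOL_L = 0.93      # LDL reduction in PCSK9 deletion carriers (mmol/L)
LDL_EFFECT_SD = 1.31          # the same reduction expressed in standard deviations
HEIGHT_SD_PER_MOTIF = 0.016   # reported height effect per 57 bp motif inserted (SD)


def implied_ldl_sd_mmol_l() -> float:
    """Population SD of LDL implied by the two reported effect sizes."""
    return LDL_EFFECT_MMOL_L / LDL_EFFECT_SD


def height_shift_sd(extra_motifs: int) -> float:
    """Expected height shift in SD under the reported linear relationship.

    extra_motifs is a hypothetical difference in motif count, not a value
    taken from the source.
    """
    return HEIGHT_SD_PER_MOTIF * extra_motifs


if __name__ == "__main__":
    print(f"Implied LDL SD: {implied_ldl_sd_mmol_l():.2f} mmol/L")
    print(f"Shift for 5 extra motifs: {height_shift_sd(5):.3f} SD")
```

The first function only rescales the reported numbers; the second assumes the linear relationship holds across whatever range of motif counts is passed in, which the abstract does not spell out.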
Introduction
BRCA1 or BRCA2 germline mutations increase the risk of developing breast cancer. Tumour cells from germline mutation carriers have frequently lost the wild-type allele. This is predicted to result in genomic instability where cell survival depends upon dysfunctional checkpoint mechanisms. Tumorigenic potential could then be acquired through further genomic alterations. Surprisingly, somatic BRCA mutations are not found in sporadic breast tumours. BRCA1 methylation has been shown to occur in sporadic breast tumours and to be associated with reduced gene expression. We examined the frequency of BRCA1 methylation in 143 primary sporadic breast tumours along with BRCA1 copy number alterations and tumour phenotype.
Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data [1,2]. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank [3]. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
Preeclampsia is a serious complication of pregnancy, affecting both maternal and fetal health. In a genome-wide association meta-analysis of European and Central Asian mothers, we identify sequence variants that associate with preeclampsia in the maternal genome at ZNF831/20q13 and FTO/16q12. These are previously established variants for blood pressure (BP), and the FTO variant has also been associated with body mass index (BMI). Further analysis of BP variants establishes that variants at MECOM/3q26, FGF5/4q21 and SH2B3/12q24 also associate with preeclampsia through the maternal genome. We further show that a polygenic risk score for hypertension associates with preeclampsia. However, comparison with gestational hypertension indicates that additional factors modify the risk of preeclampsia.
Introduction
Germline mutations in the BRCA1 and BRCA2 genes account for a considerable fraction of familial predisposition to breast cancer. Somatic mutations in BRCA1 and BRCA2 have not been found and the involvement of these genes in sporadic tumour development therefore remains unclear.