The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.
Estimation of narrow-sense heritability, h², from genome-wide SNPs genotyped in unrelated individuals has recently attracted interest and offers several advantages over traditional pedigree-based methods. With the use of this approach, it has been estimated that over half the heritability of human height can be attributed to the ~300,000 SNPs on a genome-wide genotyping array. In comparison, only 5%-10% can be explained by SNPs reaching genome-wide significance. We investigated via simulation the validity of several key assumptions underpinning the mixed-model analysis used in SNP-based h² estimation. Although we found that the method is reasonably robust to violations of four key assumptions, it can be highly sensitive to uneven linkage disequilibrium (LD) between SNPs: contributions to h² are overestimated from causal variants in regions of high LD and are underestimated in regions of low LD. The overall direction of the bias can be up or down depending on the genetic architecture of the trait, but it can be substantial in realistic scenarios. We propose a modified kinship matrix in which SNPs are weighted according to local LD. We show that this correction greatly reduces the bias and increases the precision of h² estimates. We demonstrate the impact of our method on the first seven diseases studied by the Wellcome Trust Case Control Consortium. Our LD adjustment revises downward the h² estimate for immune-related diseases, as expected because of high LD in the major-histocompatibility region, but increases it for some nonimmune diseases. To calculate our revised kinship matrix, we developed LDAK, software for computing LD-adjusted kinships.
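The idea of weighting SNPs by local LD when building a kinship matrix can be illustrated with a minimal sketch. The abstract does not state how the weights are computed, so the rule used here (the reciprocal of each SNP's summed r² with its neighbours inside a window) is an assumption chosen for simplicity and is not LDAK's algorithm. The function names `standardize`, `ld_weights` and `weighted_kinship`, and the `window` parameter, are hypothetical.

```python
import numpy as np


def standardize(genotypes):
    """Centre and scale each SNP column to mean 0 and variance 1."""
    g = np.asarray(genotypes, dtype=float)
    centred = g - g.mean(axis=0)
    sd = g.std(axis=0)
    sd[sd == 0] = 1.0  # monomorphic SNPs: avoid dividing by zero
    return centred / sd


def ld_weights(x, window=50):
    """Illustrative weights (an assumption, not LDAK's rule):
    1 / (sum of r^2 between a SNP and all SNPs within `window` positions,
    including itself). SNPs in high-LD regions get small weights."""
    n, m = x.shape
    r2 = (x.T @ x / n) ** 2  # squared correlations between SNP columns
    idx = np.arange(m)
    in_window = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = (r2 * in_window).sum(axis=1)
    weights = np.zeros(m)
    ok = scores > 0  # monomorphic SNPs keep weight 0
    weights[ok] = 1.0 / scores[ok]
    return weights


def weighted_kinship(x, weights):
    """Kinship K = X diag(w) X^T / sum(w) for standardized genotypes X."""
    return (x * weights) @ x.T / weights.sum()


# Toy data: 4 individuals (rows), 4 SNPs (columns).
# Columns 0-2 are identical (perfect LD); column 3 is uncorrelated with them.
genotypes = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 2],
    [2, 2, 2, 0],
    [2, 2, 2, 2],
])
x = standardize(genotypes)
w = ld_weights(x, window=3)
k_weighted = weighted_kinship(x, w)
k_unweighted = x @ x.T / x.shape[1]
```

In this toy data the three duplicated SNPs each receive a weight of 1/3, so together they contribute as much as the single independent SNP. In the unweighted matrix the duplicated signal is counted three times, which is the kind of over-representation of high-LD regions the abstract describes.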