Advanced age-related macular degeneration (AMD) is the leading cause of blindness in the elderly, with limited therapeutic options. Here, we report on a study of >12 million variants, including 163,714 directly genotyped, mostly rare, protein-altering variants. Analyzing 16,144 patients and 17,832 controls, we identify 52 independently associated common and rare variants (P < 5 × 10⁻⁸) distributed across 34 loci. While wet and dry AMD subtypes exhibit predominantly shared genetics, we identify the first signal specific to wet AMD, near MMP9 (difference-P = 4.1 × 10⁻¹⁰). Very rare coding variants (frequency < 0.1%) in CFH, CFI, and TIMP3 suggest causal roles for these genes, as does a splice variant in SLC16A8. Our results support the hypothesis that rare coding variants can pinpoint causal genes within known genetic loci and illustrate that applying the approach systematically to detect new loci requires extremely large sample sizes.
Background: Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. Results: We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. Conclusions: We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0694-1) contains supplementary material, which is available to authorized users.
Many modern human genomes retain DNA inherited from interbreeding with archaic hominins, such as Neanderthals, yet the influence of this admixture on human traits is largely unknown. We analyzed the contribution of common Neanderthal variants to over 1,000 electronic health record (EHR)-derived phenotypes in ~28,000 adults of European ancestry. We discovered and replicated associations of Neanderthal alleles with neurological, psychiatric, immunological, and dermatological phenotypes. Neanderthal alleles together explain a significant fraction of the variation in risk for depression and skin lesions resulting from sun exposure (actinic keratosis), and individual Neanderthal alleles are significantly associated with specific human phenotypes, including hypercoagulation and tobacco use. Our results establish that archaic admixture influences disease risk in modern humans, provide hypotheses about the effects of hundreds of Neanderthal haplotypes and demonstrate the utility of EHR data in evolutionary analyses.
Prostate cancer is a leading and increasingly prevalent cause of cancer death in men. Whereas family history of disease is one of the strongest prostate cancer risk factors and suggests a hereditary component, the predisposing genetic factors remain unknown. We first showed that KLF6 is a tumor suppressor somatically inactivated in prostate cancer and since then, its functional loss has been further established in prostate cancer cell lines and other human cancers. Wild-type KLF6, but not patient-derived mutants, suppresses cell growth through p53-independent transactivation of p21. Here we show that a germline KLF6 single nucleotide polymorphism, confirmed in a tri-institutional study of 3,411 men, is significantly associated with an increased relative risk of prostate cancer in men, regardless of family history of disease. This prostate cancer-associated allele generates a novel functional SRp40 DNA binding site and increases transcription of three alternatively spliced KLF6 isoforms. The KLF6 variant proteins KLF6-SV1 and KLF6-SV2 are mislocalized to the cytoplasm, antagonize wtKLF6 function, leading to decreased p21 expression and increased cell growth, and are up-regulated in tumor versus normal prostatic tissue. Thus, these results are the first to identify a novel mechanism of self-encoded tumor suppressor gene inactivation and link a relatively common single nucleotide polymorphism to both regulation of alternative splicing and an increased risk in a major human cancer. (Cancer Res 2005; 65(4): 1213-22)
The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but they too will not be very powerful when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a "kernel" function, which we allow to be fairly general. However, we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I-error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits, and offers guidelines for interpretation.
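The pairwise scoring idea in the abstract above can be illustrated with a minimal Python sketch. It assumes genotypes coded as minor-allele counts (0, 1, 2) and uses a simple product kernel summed over markers; both are illustrative choices, not the kernels recommended by the authors. The function names are hypothetical, and the permutation p-value is a stand-in for whatever reference distribution the paper actually derives.

```python
import random
from itertools import combinations

def linear_kernel(g1, g2):
    # Assumed kernel: sum over markers of the product of allele counts.
    return sum(a * b for a, b in zip(g1, g2))

def mean_pair_score(genotypes, kernel=linear_kernel):
    # Average kernel score over all distinct pairs of subjects (needs >= 2 subjects).
    pairs = list(combinations(genotypes, 2))
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)

def u_contrast(genotypes, is_case, kernel=linear_kernel):
    # Single global contrast: mean pair score in cases minus mean pair score in controls.
    cases = [g for g, c in zip(genotypes, is_case) if c]
    controls = [g for g, c in zip(genotypes, is_case) if not c]
    return mean_pair_score(cases, kernel) - mean_pair_score(controls, kernel)

def permutation_pvalue(genotypes, is_case, n_perm=999, seed=0, kernel=linear_kernel):
    # Assumption: a label-permutation test replaces the paper's own null distribution.
    rng = random.Random(seed)
    observed = u_contrast(genotypes, is_case, kernel)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(u_contrast(genotypes, labels, kernel)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because every marker contributes to one summed pairwise score, the contrast is a single number regardless of how many markers are tested, which is the sense in which the test has one degree of freedom.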
Genetic association studies often examine features independently, potentially missing subpopulations with multiple phenotypes that share a single cause. We describe an approach that aggregates phenotypes on the basis of patterns described by Mendelian diseases. We mapped the clinical features of 1204 Mendelian diseases into phenotypes captured from the electronic health record (EHR) and summarized this evidence as phenotype risk scores (PheRSs). In an initial validation, PheRS distinguished cases and controls of five Mendelian diseases. Applying PheRS to 21,701 genotyped individuals uncovered 18 associations between rare variants and phenotypes consistent with Mendelian diseases. In 16 patients, the rare genetic variants were associated with severe outcomes such as organ transplants. PheRS can augment rare-variant interpretation and may identify subsets of patients with distinct genetic causes for common diseases.
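As a rough illustration of how a phenotype risk score of this kind might be computed, the Python sketch below sums a weight for each feature of a Mendelian disease that appears in a patient's record. The abstract does not state the weighting scheme; the log inverse-prevalence weight used here is an assumption, and the function names and data layout (a dict mapping patient IDs to sets of phenotype codes) are hypothetical.

```python
import math

def phenotype_weights(patients, features):
    # patients: dict of patient id -> set of phenotype codes seen in the EHR.
    # Assumed weight: log(cohort size / number of patients with the phenotype),
    # so rarer phenotypes contribute more to the score.
    n = len(patients)
    weights = {}
    for feature in features:
        count = sum(1 for codes in patients.values() if feature in codes)
        if count:  # phenotypes never observed get no weight
            weights[feature] = math.log(n / count)
    return weights

def phers(codes, weights):
    # Score for one patient: sum of weights of the disease features present.
    return sum(w for feature, w in weights.items() if feature in codes)
```

Under this sketch, a patient with several rare features of the same Mendelian disease receives a high score, while a patient with none of the features scores zero.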
Major depressive disorder (MDD) is a common psychiatric disease. Selective serotonin reuptake inhibitors (SSRIs) are an important class of drugs used to treat MDD. However, many patients do not respond adequately to SSRI therapy. We used a pharmacometabolomics-informed pharmacogenomic research strategy to identify citalopram/escitalopram treatment outcome biomarkers. Metabolomic assay of plasma samples from 20 escitalopram remitters and 20 nonremitters showed that glycine was negatively associated with treatment outcome (p=0.0054). That observation was pursued by genotyping tag single nucleotide polymorphisms (SNPs) for genes encoding glycine synthesis and degradation enzymes using 529 DNA samples from SSRI-treated MDD patients. The rs10975641 SNP in the glycine dehydrogenase gene was associated with treatment outcome phenotypes. Rs10975641 was then genotyped and was significant (p=0.02) in DNA from 1245 MDD patients in the STAR*D depression study. These results highlight both a possible role for glycine in SSRI response and the use of pharmacometabolomics to "inform" pharmacogenomics. Major depressive disorder (MDD) is a common psychiatric disorder worldwide (1). The majority of these patients receive antidepressant medications as first-line treatment, but there are large variations in the efficacy of all antidepressants, including the widely prescribed selective serotonin reuptake inhibitors (SSRIs) (2). On average, 40% of patients do not "respond" to these drugs, defined as a 50% or greater reduction in symptoms, and over two thirds do not achieve complete "remission" of symptoms after antidepressant therapy (2,3). 
Therefore, there is a need to identify biomarkers that might help to predict treatment outcomes prior to antidepressant therapy and might also provide insight into drug response mechanisms. Metabolomics is a rapidly developing discipline that represents an attempt to capture global biochemical events by assaying the metabolome, the total repertoire of small molecules in biological samples, to define metabolomic "signatures" (4,5). The emerging field of pharmacometabolomics is focused on metabolomic signatures for drug exposure and/or efficacy, with the goal of using these signatures to better individualize drug therapy (4,6). Pharmacogenomics shares the goals of pharmacometabolomics but utilizes genomic rather than metabolomic data (7). Many pharmacogenetic studies of antidepressant drugs, particularly SSRIs, have been performed. Those studies have generally focused on polymorphisms in candidate genes, including those encoding the serotonin transporter; a variety of serotonin receptors; enzymes involved in serotonin biosynthesis; and drug metabolizing enzymes specific to the particular SSRI being studied (8-10). However, these candidate gene-based studies, and even recently published genome-wide association studies (GWAS), have failed to provide reliable biomarkers for SSRI treatment outcome (11-13). In the present study, a "pharmacometabolomics-informed pharmacogenomic" research strategy ( ...
Summary: Over the last decade, significant technological breakthroughs have revolutionized human genomic research in the form of genome-wide association studies (GWASs). GWASs have identified thousands of statistically significant genetic variants associated with hundreds of human conditions including many with immunological aetiologies (e.g. multiple sclerosis, ankylosing spondylitis and rheumatoid arthritis). Unfortunately, most GWASs fail to identify clinically significant associations. Identifying biologically significant variants by GWAS also presents a challenge. The GWAS is a phenotype-to-genotype approach. As a complementary/alternative approach to the GWAS, investigators have begun to exploit extensive electronic medical record systems to conduct a genotype-to-phenotype approach when studying human disease, specifically the phenome-wide association study (PheWAS). Although the PheWAS approach is in its infancy, this method has already demonstrated its capacity to rediscover important genetic associations related to immunological diseases/conditions. Furthermore, PheWAS has the advantage of identifying genetic variants with pleiotropic properties. This is particularly relevant for HLA variants. For example, PheWAS results have demonstrated that the HLA-DRB1 variant associated with multiple sclerosis may also be associated with erythematous conditions including rosacea. Likewise, PheWAS has demonstrated that the HLA-B genotype is not only associated with spondylopathies, uveitis, and variability in platelet count, but may also play an important role in other conditions, such as mastoiditis. This review will discuss and compare general PheWAS methodologies, describe both the challenges and advantages of the PheWAS, and provide insight into the potential directions in which PheWAS may lead.