Recently, we reported a method to estimate the proportion of phenotypic variance explained by all SNPs from genome-wide association studies, and estimated that half of the heritability for human height was captured by common SNPs. Here we partition genetic variation for height, body mass index (BMI), von Willebrand factor (vWF) and QT interval (QTi) onto chromosomes and chromosome segments, using 586,898 SNPs genotyped on 11,586 unrelated individuals. We estimate that ~45%, ~17%, ~25% and ~21% of variance in height, BMI, vWF and QTi, respectively, can be explained by considering all autosomal SNPs simultaneously, and a further ~0.5–1% by X-chromosome SNPs for height, BMI and vWF. We show that variance explained by each chromosome for height and QTi is proportional to the total gene length on that chromosome. In genome-wide analyses, common SNPs in or near genes explain more variation than SNPs between genes. We propose a novel approach to estimate variation due to cryptic relatedness and population stratification. Our results provide further evidence that a substantial proportion of heritability is accounted for by causal variants in linkage disequilibrium with common SNPs; that height, BMI and QTi are highly polygenic traits; and that the additive variation explained by a part of the genome is approximately proportional to the total length of DNA contained within genes therein.
Case-parent trios were used in a genome-wide association study of cleft lip with/without cleft palate (CL/P). SNPs near two genes not previously associated with CL/P [MAFB: most significant SNP rs13041247, with odds ratio per minor allele OR=0.704; 95% CI=0.635, 0.778; p=2.05×10⁻¹¹; and ABCA4: most significant SNP rs560426, with OR=1.432; 95% CI=1.292, 1.587; p=5.70×10⁻¹²] and two previously identified regions (chr. 8q24 and IRF6) attained genome-wide significance. Stratifying trios into European and Asian ancestry groups revealed differences in statistical significance, although estimated effect sizes were similar. Replication studies from several populations showed confirming evidence, with families of European ancestry giving stronger evidence for markers in 8q24 while Asian families showed stronger evidence for MAFB and ABCA4. Expression studies support a role for MAFB in palate development.
Corresponding author: THB (tbeaty@jhsph.edu). Published in final edited form as: Nat Genet. 2010 June; 42(6): 525-529. doi:10.1038/ng.580. Author manuscript; available in PMC 2010 September 17.
Cleft lip with or without cleft palate (CL/P) is a common human birth defect with documented genetic and environmental risk factors [1]. While CL/P can occur in many Mendelian malformation syndromes, the isolated, non-syndromic form constitutes 70% of all cases [2]. Evidence for genetic control of CL/P is compelling: recurrence risks are 20-30 times greater than population prevalences [3,4], and both twin and family studies [5] suggest a major role for genes. Mutations in IRF6 cause Van der Woude syndrome, the most common Mendelian syndrome including CL/P, and markers in IRF6 have repeatedly shown evidence of association with isolated, non-syndromic CL/P [6-9]. An allele disrupting an AP2 binding site near IRF6 showed particularly strong evidence among European CL families, although multiple risk alleles are likely [10].
Birnbaum et al. [11] conducted a case-control genome-wide association study (GWAS) in Germany and found significant evidence of association with markers in 8q24.21, and a US case-control GWAS confirmed this region [12], with rs987525 being the most significant marker in both studies. Here we present a GWAS using a case-parent trio design in a consortium drawing cases from Europe, the US, China, Taiwan, Singapore, Korea and the Philippines. This design has the advantage of being robust to confounding due to population stratification, which is important when cases from diverse populations are combined.
Results
Because these case-parent trios came from different populations (Table 1), we conducted a principal components analysis (PCA) on all parents to document genetic variation in our consortium (Supplementary Figure 1). Approximately 50% of parents could be classified as Asian and 45% as European, with remaining parents being of African or "other" ancestry (including mixed). Transmission disequilibrium tests...
Discovering the genetic basis of a Mendelian phenotype establishes a causal link between genotype and phenotype, making possible carrier and population screening and direct diagnosis. Such discoveries also contribute to our knowledge of gene function, gene regulation, development, and biological mechanisms that can be used for developing new therapeutics. As of February 2015, 2,937 genes underlying 4,163 Mendelian phenotypes have been discovered, but the genes underlying ∼50% (i.e., 3,152) of all known Mendelian phenotypes are still unknown, and many more Mendelian conditions have yet to be recognized. This is a formidable gap in biomedical knowledge. Accordingly, in December 2011, the NIH established the Centers for Mendelian Genomics (CMGs) to provide the collaborative framework and infrastructure necessary for undertaking large-scale whole-exome sequencing and discovery of the genetic variants responsible for Mendelian phenotypes. In partnership with 529 investigators from 261 institutions in 36 countries, the CMGs assessed 18,863 samples from 8,838 families representing 579 known and 470 novel Mendelian phenotypes as of January 2015. This collaborative effort has identified 956 genes, including 375 not previously associated with human health, that underlie a Mendelian phenotype. These results provide insight into study design and analytical strategies, identify novel mechanisms of disease, and reveal the extensive clinical variability of Mendelian phenotypes. Discovering the gene underlying every Mendelian phenotype will require tackling challenges such as worldwide ascertainment and phenotypic characterization of families affected by Mendelian conditions, improvement in sequencing and analytical techniques, and pervasive sharing of phenotypic and genomic data among researchers, clinicians, and families.
Clonal mosaicism for large chromosomal anomalies (duplications, deletions and uniparental disomy) was detected using SNP microarray data from over 50,000 subjects recruited for genome-wide association studies. This detection method requires a relatively high frequency of cells (>5–10%) with the same abnormal karyotype (presumably of clonal origin) in the presence of normal cells. The frequency of detectable clonal mosaicism in peripheral blood is low (<0.5%) from birth until 50 years of age, after which it rises rapidly to 2–3% in the elderly. Many of the mosaic anomalies are characteristic of those found in hematological cancers and identify common deleted regions that pinpoint the locations of genes previously associated with hematological cancers. Although only 3% of subjects with detectable clonal mosaicism had any record of hematological cancer prior to DNA sampling, those without a prior diagnosis have an estimated 10-fold higher risk of a subsequent hematological cancer (95% confidence interval = 6–18).
To further dissect the genetic architecture of colorectal cancer (CRC), we performed whole-genome sequencing of 1,439 cases and 720 controls, imputed discovered sequence variants and Haplotype Reference Consortium panel variants into genome-wide association study data, and tested for association in 34,869 cases and 29,051 controls. Findings were followed up in an additional 23,262 cases and 38,296 controls. We discovered a strongly protective 0.3% frequency variant signal at CHD1. In a combined meta-analysis of 125,478 individuals, we identified 40 new independent signals at P < 5×10⁻⁸, bringing the number of known independent signals for CRC to approximately 100. New signals implicate lower-frequency variants, Krüppel-like factors, Hedgehog signaling, Hippo-YAP signaling, long noncoding RNAs, somatic drivers, and support a role of immune function. Heritability analyses suggest that CRC risk is highly polygenic, and larger, more comprehensive studies enabling rare variant analysis will improve understanding of underlying biology, and impact personalized screening strategies and drug development.
Background: Common cancers develop through a multistep process often including inherited susceptibility. Collaboration among multiple institutions, and funding from multiple sources, has allowed the development of an inexpensive genotyping microarray, the OncoArray. The array includes a genome-wide backbone, comprising 230,000 SNPs tagging most common genetic variants, together with dense mapping of known susceptibility regions, rare variants from sequencing experiments, pharmacogenetic markers and cancer-related traits. Methods: The OncoArray can be genotyped using a novel technology developed by Illumina to facilitate efficient genotyping. The consortium developed standard approaches for selecting SNPs for study, for quality control of markers and for ancestry analysis. The array was genotyped at selected sites and with prespecified replicate samples to permit evaluation of genotyping accuracy among centers and by ethnic background. Results: The OncoArray consortium genotyped 447,705 samples. A total of 494,763 SNPs passed quality control steps, with a sample success rate of 97%. Participating sites performed ancestry analysis using a common set of markers and a scoring algorithm based on principal components analysis. Conclusions: Results from these analyses will enable researchers to identify new susceptibility loci, perform fine mapping of new or known loci associated with either single or multiple cancers, assess the degree of overlap in cancer causation and pleiotropic effects of loci that have been identified for disease-specific risk, and jointly model genetic, environmental and lifestyle-related exposures. Impact: Ongoing analyses will shed light on etiology and risk assessment for many types of cancer.
A synthetic genetic system is designed and characterized that allows Escherichia coli to sense and eradicate Pseudomonas aeruginosa, providing a novel antimicrobial strategy that could potentially be applied to fighting infectious pathogens.
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. The recent application of GWAS to clinic-based cohorts has also yielded genetic predictors of clinical outcomes. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. With each new dataset, new realities are discovered about GWAS data and best practices continue to be developed. The Genomics Workgroup of the National Human Genome Research Institute (NHGRI)-funded electronic Medical Records and Genomics (eMERGE) network has invested considerable effort in developing strategies for QC of these data. The lessons learned by this group will be valuable for other investigators dealing with large-scale genomic datasets. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the eMERGE network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. In this protocol we discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
Curr Protoc Hum Genet. Author manuscript; available in PMC 2012 January 1.