Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry 1-3. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific 4-10. Additionally, effect sizes, and the risk prediction scores built from them, derived in one population may…
The timing of puberty is a highly polygenic childhood trait that is epidemiologically associated with various adult diseases. Using 1000 Genomes Project–imputed genotype data in up to ~370,000 women, we identify 389 independent signals (P < 5 × 10⁻⁸) for age at menarche, a milestone in female pubertal development. In Icelandic data, these signals explain ~7.4% of the population variance in age at menarche, corresponding to ~25% of the estimated heritability. We implicate ~250 genes via coding variation or associated expression, demonstrating significant enrichment in neural tissues. Rare variants near the imprinted genes MKRN3 and DLK1 were identified, exhibiting large effects when paternally inherited. Mendelian randomization analyses suggest causal inverse associations, independent of body mass index (BMI), between puberty timing and risks for breast and endometrial cancers in women and prostate cancer in men. In aggregate, our findings highlight the complexity of the genetic regulation of puberty timing and support causal links with cancer susceptibility.
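The two percentages quoted above fix a third number: if the signals explain about 7.4% of trait variance and that amounts to about 25% of the heritability, the implied heritability is roughly 0.074 / 0.25. A minimal sketch of that arithmetic, using only the rounded figures from the abstract (the function name is illustrative, not from the paper):

```python
def implied_heritability(variance_explained: float, share_of_heritability: float) -> float:
    """Heritability implied by a variance-explained fraction and the
    share of total heritability that this fraction represents."""
    if not 0 < share_of_heritability <= 1:
        raise ValueError("share_of_heritability must be in (0, 1]")
    return variance_explained / share_of_heritability


# Rounded inputs from the abstract: ~7.4% of variance, ~25% of heritability.
h2 = implied_heritability(0.074, 0.25)
print(f"implied heritability: {h2:.2f}")
```

Because both inputs are approximate ("~"), the result should be read as a rough figure of about 0.3 rather than a precise estimate.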
US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.
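The "genomic inflation" mentioned above is commonly summarised by the factor lambda_GC: the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null (about 0.455). The sketch below shows that standard calculation from a list of p-values using only the Python standard library. It is a generic illustration, not the HCHS/SOL pipeline, and the p-values are synthetic.

```python
from statistics import NormalDist, median

_NORM = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _NORM.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Lambda_GC from two-sided p-values (each must satisfy 0 < p <= 1)."""
    # Convert each two-sided p-value back to a 1-df chi-square statistic.
    chi2 = [_NORM.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Synthetic example: evenly spaced p-values mimic a well-calibrated null.
null_p = [(i - 0.5) / 1000 for i in range(1, 1001)]
# Squaring shrinks every p-value, mimicking inflated test statistics.
inflated_p = [p ** 2 for p in null_p]

print(round(genomic_inflation(null_p), 3))
print(genomic_inflation(inflated_p) > 1.0)
```

A value near 1 suggests well-calibrated test statistics, while values above 1 suggest inflation from sources such as unmodelled structure or relatedness.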
Summary. Background: Spinal and bulbar muscular atrophy (SBMA) is caused by polyglutamine expansion in the androgen receptor, which results in ligand-dependent toxicity. Animal models have a neuromuscular deficit that is mitigated by androgen-reducing treatment.
Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data 1, but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs 2-5. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as overestimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be fully recovered from whole-genome sequence (WGS) data on 21,620 unrelated individuals of European ancestry. We assigned 47.1 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned variation accordingly. The estimated heritability was 0.79 (SE 0.09) for height and 0.40 (SE 0.09) for BMI, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection thereon. Cumulatively, variants in the MAF range of 0.0001 to 0.1 explained 0.54 (SE 0.05) and 0.51 (SE 0.11) of heritability for height and BMI, respectively. Our results imply that the still missing heritability of complex traits and disease is accounted for by rare variants, in particular those in regions of low LD.
Aims/hypothesis: We examined race differences in the association between age at menarche and type 2 diabetes before and after adjustment for adiposity. Methods: We analysed baseline and 9-year follow-up data from 8,491 women (n = 2,505 African-American, mean age 53.3 years; n = 5,986 white, mean age 54.0 years) in the Atherosclerosis Risk in Communities (ARIC) study. Stratifying by race, we used logistic regression to estimate the OR for prevalent diabetes at baseline, and Cox proportional hazard models to estimate the HR for incident diabetes over follow-up according to age at menarche category (8–11, 12, 13, 14 and 15–18 years). Results: Adjusting for age and centre, we found that early age at menarche (8–11 vs 13 years) was associated with diabetes for white, but not African-American women in both the prevalent (white OR 1.72, 95% CI 1.32, 2.25; African-American OR 1.13, 95% CI 0.84, 1.51; interaction p = 0.043) and incident models (white HR 1.43, 95% CI 1.08, 1.89; African-American HR 1.20, 95% CI 0.87, 1.67; interaction p = 0.527). Adjustment for adiposity and lifestyle confounders attenuated associations for prevalent (white OR 1.41, 95% CI 1.05, 1.89; African-American OR 0.94, 95% CI 0.68, 1.30; interaction p = 0.093) and incident diabetes (white HR 1.22, 95% CI 0.92, 1.63; African-American HR 1.11, 95% CI 0.80, 1.56; interaction p = 0.554). Conclusions/interpretation: Early menarche was associated with type 2 diabetes in white women, and adulthood adiposity attenuated the relationship. We did not find a similar association in African-American women. Our findings suggest that there may be race/ethnic differences in the influence of developmental factors in the aetiology of type 2 diabetes, which merit further investigation.
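The stratified odds ratios above can be compared informally from their published confidence intervals. The sketch below recovers each standard error on the log scale from the 95% CI and then applies a simple z-test for the difference between two independent log odds ratios. This is a back-of-envelope approximation under a normality assumption, not the interaction test the authors fitted, so it should only be expected to land near the reported interaction p-value, not to match it. Function names are illustrative.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()
Z95 = _NORM.inv_cdf(0.975)  # about 1.96


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Log odds ratio and its standard error, recovered from a 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return log_or, se


def heterogeneity_p(group_a, group_b):
    """Two-sided p-value for a difference between two independent log ORs."""
    b_a, se_a = log_or_and_se(*group_a)
    b_b, se_b = log_or_and_se(*group_b)
    z = (b_a - b_b) / math.hypot(se_a, se_b)
    return 2 * (1 - _NORM.cdf(abs(z)))


# Age- and centre-adjusted prevalent-diabetes estimates from the abstract:
# (OR, lower 95% limit, upper 95% limit).
white = (1.72, 1.32, 2.25)
african_american = (1.13, 0.84, 1.51)

p_het = heterogeneity_p(white, african_american)
print(f"approximate heterogeneity p-value: {p_het:.3f}")
```

Treating the two strata as independent is reasonable here because they contain different women. Rounding in the published intervals limits how closely the result can track a model-based test.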
The cohort design allows investigators to explore the genetic basis of a variety of diseases and traits in a single study while avoiding major weaknesses of the case-control design. Most cohort studies employ multistage cluster sampling with unequal probabilities to conveniently select participants with desired characteristics, and participants from different clusters might be genetically related. Analysis that ignores the complex sampling design can yield biased estimation of the genetic association and inflation of the type I error. Herein, we develop weighted estimators that reflect unequal selection probabilities and differential nonresponse rates, and we derive variance estimators that properly account for the sampling design and the potential relatedness of participants in different sampling units. We compare, both analytically and numerically, the performance of the proposed weighted estimators with unweighted estimators that disregard the sampling design. We demonstrate the usefulness of the proposed methods through analysis of MetaboChip data in the Hispanic Community Health Study/Study of Latinos, which is the largest health study of the Hispanic/Latino population in the United States aimed at identifying risk factors for various diseases and determining the role of genes and environment in the occurrence of diseases. We provide guidelines on the use of weighted and unweighted estimators, as well as the relevant software.
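To illustrate the general idea of weighting by unequal selection probabilities, the sketch below computes an inverse-probability-weighted regression slope of a trait on a genotype dosage. It is a generic textbook-style weighted least squares estimate and not the estimator proposed in the paper: it ignores nonresponse adjustment, clustering, relatedness and variance estimation. All names and data are hypothetical.

```python
def weighted_slope(genotypes, traits, selection_probs):
    """Inverse-probability-weighted least squares slope of trait on genotype."""
    if not (len(genotypes) == len(traits) == len(selection_probs)):
        raise ValueError("inputs must have the same length")
    # Each participant is weighted by the inverse of their selection probability.
    weights = [1.0 / p for p in selection_probs]
    w_sum = sum(weights)
    g_bar = sum(w * g for w, g in zip(weights, genotypes)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, traits)) / w_sum
    num = sum(w * (g - g_bar) * (y - y_bar)
              for w, g, y in zip(weights, genotypes, traits))
    den = sum(w * (g - g_bar) ** 2 for w, g in zip(weights, genotypes))
    return num / den


# Hypothetical data: allele dosages, a trait that is exactly 1 + 0.5 * dosage,
# and unequal selection probabilities (e.g. some groups oversampled).
genotypes = [0, 1, 2, 1, 0, 2]
traits = [1.0 + 0.5 * g for g in genotypes]
selection_probs = [0.1, 0.5, 0.2, 0.5, 0.1, 0.2]

print(weighted_slope(genotypes, traits, selection_probs))
```

With equal selection probabilities the weights cancel and the estimate reduces to the ordinary unweighted least squares slope. The two differ only when the sampling design is informative about the trait.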
Genome-wide association studies (GWASs) primarily performed in European-ancestry (EA) populations have identified numerous loci associated with body mass index (BMI). However, it is still unclear whether these GWAS loci can be generalized to other ethnic groups, such as African Americans (AAs). Furthermore, the putative functional variant or variants in these loci mostly remain under investigation. The overall lower linkage disequilibrium in AA compared to EA populations provides the opportunity to narrow in or fine-map these BMI-related loci. Therefore, we used the Metabochip to densely genotype and evaluate 21 BMI GWAS loci identified in EA studies in 29,151 AAs from the Population Architecture using Genomics and Epidemiology (PAGE) study. Eight of the 21 loci (SEC16B, TMEM18, ETV5, GNPDA2, TFAP2B, BDNF, FTO, and MC4R) were found to be associated with BMI in AAs at 5.8 × 10⁻⁵. Within seven out of these eight loci, we found that, on average, a substantially smaller number of variants was correlated (r² > 0.5) with the most significant SNP in AA than in EA populations (16 versus 55). Conditional analyses revealed GNPDA2 harboring a potential additional independent signal. Moreover, Metabochip-wide discovery analyses revealed two BMI-related loci, BRE (rs116612809, p = 3.6 × 10⁻⁸) and DHX34 (rs4802349, p = 1.2 × 10⁻⁷), which were significant when adjustment was made for the total number of SNPs tested across the chip. These results demonstrate that fine mapping in AAs is a powerful approach for both narrowing in on the underlying causal variants in known loci and discovering BMI-related loci.