For many traits, including susceptibility to common diseases in humans, causal loci uncovered by genetic mapping studies explain only a minority of the heritable contribution to trait variation. Multiple explanations for this “missing heritability” have been proposed [1]. Here we use a large cross between two yeast strains to accurately estimate different sources of heritable variation for 46 quantitative traits and to detect underlying loci with high statistical power. We find that the detected loci explain nearly the entire additive contribution to heritable variation for the traits studied. We also show that the contribution to heritability of gene-gene interactions varies among traits, from near zero to approximately 50%. Detected two-locus interactions explain only a minority of this contribution. These results substantially advance our understanding of the missing heritability problem and have important implications for future studies of complex and quantitative traits.
The nematode Caenorhabditis elegans is central to research in molecular, cell, and developmental biology, but nearly all of this research has been conducted on a single strain. Comparatively little is known about the population genomic and evolutionary history of this species. We characterized C. elegans genetic variation by high-throughput selective sequencing of a worldwide collection of 200 wild strains, identifying 41,188 single nucleotide polymorphisms. Unexpectedly, C. elegans genome variation is dominated by a set of commonly shared haplotypes on four of the six chromosomes, each spanning many megabases. Population-genetic modeling shows that this pattern was generated by chromosome-scale selective sweeps that have reduced variation worldwide; at least one of these sweeps likely occurred in the past few hundred years. These sweeps, which we hypothesize to be a result of human activity, have dramatically reshaped the global C. elegans population in the recent past.
Heritable variation in gene expression forms a crucial bridge between genomic variation and the biology of many traits. However, most expression quantitative trait loci (eQTLs) remain unidentified. We mapped eQTLs by transcriptome sequencing in 1012 yeast segregants. The resulting eQTLs accounted for over 70% of the heritability of mRNA levels, allowing comprehensive dissection of regulatory variation. Most genes had multiple eQTLs. Most expression variation arose from trans-acting eQTLs distant from their target genes. Nearly all trans-eQTLs clustered at 102 hotspot locations, some of which influenced the expression of thousands of genes. Fine-mapped hotspot regions were enriched for transcription factor genes. While most genes had a local eQTL, most of these had no detectable effects on the expression of other genes in trans. Hundreds of non-additive genetic interactions accounted for small fractions of expression variation. These results reveal the complexity of genetic influences on transcriptome variation in unprecedented depth and detail.
The mechanistic basis for how genetic variants cause differences in phenotypic traits is often elusive. We identified a quantitative trait locus in Caenorhabditis elegans that affects three seemingly unrelated phenotypic traits: lifetime fecundity, adult body size, and susceptibility to the human pathogen Staphylococcus aureus. We found that a QTL for all three traits arises from variation in the neuropeptide receptor gene npr-1. Moreover, we found that variation in npr-1 is also responsible for differences in 247 gene expression traits. Variation in npr-1 is known to determine whether animals disperse throughout a bacterial lawn or aggregate at the edges of the lawn. We found that the allele that leads to aggregation is associated with reduced growth and reproductive output. The altered gene expression pattern caused by this allele suggests that the aggregation behavior might cause a weak starvation state, which is known to reduce growth rate and fecundity. Importantly, we show that variation in npr-1 causes each of these phenotypic differences through behavioral avoidance of ambient oxygen concentrations. These results suggest that variation in npr-1 has broad pleiotropic effects mediated by altered exposure to bacterial food.
Genetic mapping studies of quantitative traits typically focus on detecting loci that contribute additively to trait variation. Genetic interactions are often proposed as a contributing factor to trait variation, but the relative contribution of interactions to trait variation is a subject of debate. Here, we use a very large cross between two yeast strains to accurately estimate the fraction of phenotypic variance due to pairwise QTL-QTL interactions for 20 quantitative traits. We find that this fraction is 9% on average, substantially less than the contribution of additive QTL (43%). Statistically significant QTL-QTL pairs typically have small individual effect sizes, but collectively explain 40% of the pairwise interaction variance. We show that pairwise interaction variance is largely explained by pairs of loci at least one of which has a significant additive effect. These results refine our understanding of the genetic architecture of quantitative traits and help guide future mapping studies.
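The split between additive and pairwise-interaction variance described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the study's estimation procedure: the two unlinked loci, the effect sizes, the noise level, and the helper names (`simulate_cross`, `r_squared`, `variance_fractions`) are all hypothetical. It also uses per-term squared correlations, which match a joint regression only when the predictors are uncorrelated, as is approximately the case for unlinked loci in a balanced haploid cross.

```python
import random
from statistics import fmean, pvariance

def simulate_cross(n, a1, a2, inter, noise_sd, seed=0):
    """Simulate n haploid segregants at two unlinked loci coded -1/+1 (hypothetical model)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.choice((-1.0, 1.0))
        x2 = rng.choice((-1.0, 1.0))
        # Phenotype = two additive effects + one pairwise interaction + Gaussian noise.
        y = a1 * x1 + a2 * x2 + inter * x1 * x2 + rng.gauss(0.0, noise_sd)
        rows.append((x1, x2, y))
    return rows

def r_squared(xs, ys):
    """Squared Pearson correlation: fraction of variance in ys explained by xs alone."""
    mx, my = fmean(xs), fmean(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov * cov / (pvariance(xs) * pvariance(ys))

def variance_fractions(rows):
    """Return (additive fraction, interaction fraction) of phenotypic variance."""
    ys = [r[2] for r in rows]
    x1 = [r[0] for r in rows]
    x2 = [r[1] for r in rows]
    x12 = [r[0] * r[1] for r in rows]  # interaction term as a product of genotype codes
    additive = r_squared(x1, ys) + r_squared(x2, ys)
    interaction = r_squared(x12, ys)
    return additive, interaction

# Illustrative effect sizes only; they are not taken from the study.
rows = simulate_cross(n=20000, a1=1.0, a2=1.0, inter=0.5, noise_sd=1.0)
additive, interaction = variance_fractions(rows)
print(f"additive fraction: {additive:.3f}, interaction fraction: {interaction:.3f}")
```

With these illustrative parameters the expected variance components are 2.0 (additive), 0.25 (interaction), and 1.0 (noise), so the estimates should land near 2/3.25 and 0.25/3.25 up to sampling error.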
The genetic variants underlying complex traits are often elusive even in powerful model organisms such as Caenorhabditis elegans with controlled genetic backgrounds and environmental conditions. Two major contributing factors are: (1) the lack of statistical power from measuring the phenotypes of small numbers of individuals, and (2) the use of phenotyping platforms that do not scale to hundreds of individuals and are prone to noisy measurements. Here, we generated a new resource of 359 recombinant inbred strains that augments the existing C. elegans N2xCB4856 recombinant inbred advanced intercross line population. This new strain collection removes variation in the neuropeptide receptor gene npr-1, known to have large physiological and behavioral effects on C. elegans, and mitigates the hybrid strain incompatibility caused by zeel-1 and peel-1, allowing for identification of quantitative trait loci that otherwise would have been masked by those effects. Additionally, we optimized highly scalable and accurate high-throughput assays of fecundity and body size using the COPAS BIOSORT large particle nematode sorter. Using these assays, we identified quantitative trait loci involved in fecundity and growth under normal growth conditions and after exposure to the herbicide paraquat, including independent genetic loci that regulate different stages of larval growth. Our results offer a powerful platform for the discovery of the genetic variants that control differences in responses to drugs, other aqueous compounds, bacterial foods, and pathogenic stresses.
Variation among individuals arises in part from differences in DNA sequences, but the genetic basis for variation in most traits, including common diseases, remains only partly understood. Many DNA variants influence phenotypes by altering the expression level of one or multiple genes. The effects of such variants can be detected as expression quantitative trait loci (eQTL) [1]. Traditional eQTL mapping requires large-scale genotype and gene expression data for each individual in the study sample, which limits sample sizes to hundreds of individuals in both humans and model organisms and reduces statistical power [2–6]. Consequently, many eQTL are likely missed, especially those with smaller effects [7]. Further, most studies use mRNA rather than protein abundance as the measure of gene expression. Studies that have used mass-spectrometry proteomics [8–13] reported surprising differences between eQTL and protein QTL (pQTL) for the same genes [9,10], but these studies have been even more limited in scope. Here, we introduce a powerful method for identifying genetic loci that influence protein expression in the yeast Saccharomyces cerevisiae. We measure single-cell protein abundance through the use of green-fluorescent-protein tags in very large populations of genetically variable cells, and use pooled sequencing to compare allele frequencies across the genome in thousands of individuals with high vs. low protein abundance. We applied this method to 160 genes and detected many more loci per gene than previous studies. We also observed closer correspondence between loci that influence protein abundance and loci that influence mRNA abundance of a given gene. Most loci cluster at hotspot locations that influence multiple proteins, in some cases more than half of those examined. The variants that underlie these hotspots have profound effects on the gene regulatory network and provide insights into genetic variation in cell physiology between yeast strains.
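The core comparison in the pooled-sequencing approach above, allele frequencies in a high-abundance pool versus a low-abundance pool, can be sketched as follows. The abstract does not say which statistic the authors used, so the two-proportion z-score here is a simple stand-in chosen for illustration. The function names, site labels, and read counts are hypothetical, and the sketch ignores sampling of cells into pools and smoothing across linked sites.

```python
from math import sqrt

def allele_freq(ref_reads, alt_reads):
    """Fraction of reads carrying the alternate allele at one site."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads at this site")
    return alt_reads / total

def pool_difference(high, low):
    """Compare one site between pools.

    high, low: (ref_reads, alt_reads) tuples for the high- and low-abundance pools.
    Returns (allele-frequency difference, two-proportion z-score).
    """
    p_high = allele_freq(*high)
    p_low = allele_freq(*low)
    n_high, n_low = sum(high), sum(low)
    # Pooled alternate-allele frequency under the null of no difference between pools.
    pooled = (high[1] + low[1]) / (n_high + n_low)
    se = sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
    delta = p_high - p_low
    z = delta / se if se > 0 else 0.0
    return delta, z

# Hypothetical read counts: site_A looks unlinked, site_B shows a strong shift.
sites = {
    "site_A": ((50, 50), (52, 48)),
    "site_B": ((30, 70), (70, 30)),
}
results = {name: pool_difference(high, low) for name, (high, low) in sites.items()}
for name, (delta, z) in results.items():
    print(f"{name}: delta={delta:+.2f}, z={z:+.2f}")
```

A site linked to a locus affecting protein abundance would be expected to show a large frequency difference between pools, while an unlinked site should stay close to zero.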
Apple (Malus × domestica Borkh.) is one of the world's most valuable fruit crops. Its large size and long juvenile phase make it a particularly promising candidate for marker-assisted selection (MAS). However, advances in MAS in apple have been limited by a lack of phenotype and genotype data from sufficiently large samples. To establish genotype-phenotype relationships and advance MAS in apple, we extracted over 24,000 phenotype scores from the USDA-Germplasm Resources Information Network (GRIN) database and linked them with over 8000 single nucleotide polymorphisms (SNPs) from 689 apple accessions from the USDA apple germplasm collection clonally preserved in Geneva, NY. We find significant genetic differentiation between Old World and New World cultivars and demonstrate that the genetic structure of the domesticated apple also reflects the time required for ripening. A genome-wide association study (GWAS) of 36 phenotypes confirms the association between fruit color and the MYB1 locus, and we also report a novel association between the transcription factor NAC18.1 and both harvest date and fruit firmness. We demonstrate that harvest time and fruit size can be predicted with relatively high accuracies (r > 0.46) using genomic prediction. Rapid decay of linkage disequilibrium (LD) in apples means millions of SNPs may be required for well-powered GWAS. However, rapid LD decay also promises to enable extremely high resolution mapping of causal variants, which holds great potential for advancing MAS.