Extensive studies are currently being performed to associate disease susceptibility with one form of genetic variation, namely single nucleotide polymorphisms (SNPs). In recent years another type of common genetic variation has been characterised, namely structural variation, including copy number variations (CNVs). To determine the overall contribution of CNVs to complex phenotypes we have performed association analyses of expression levels of 14,925 transcripts with SNPs and CNVs in individuals who are part of the International HapMap project. SNPs and CNVs captured 83.6% and 17.7% of the total detected genetic variation in gene expression, respectively, but the signals from the two types of variation had little overlap. Interrogation of the genome for both types of variants may be an effective way to elucidate the causes of complex phenotypes and disease in humans.
Understanding the genetic basis of phenotypic variation in human populations is currently one of the major goals in human genetics. Gene expression (the transcription of DNA into messenger RNA) has been interrogated in a variety of species and experimental scenarios to investigate the genetic basis of variation in gene regulation (1-8), and to tease apart regulatory networks (9, 10). In some respects, a comprehensive survey of gene expression
* Correspondence should be addressed to: Emmanouil T. Dermitzakis (md4@sanger.ac.uk; +44-1223-494866) or Matthew E. Hurles (meh@sanger.ac.uk; +44-1223-495377)
(26) and www.sanger.ac.uk/humgen/cnv/data). Log2 ratios from two sets of clones were analyzed: the whole set of 24,963 autosomal clones (CGH-clones) and the 1322 autosomal clones corresponding to CNVs present in at least two HapMap individuals (CNV clones) (26). We excluded genes on sex chromosomes due to their imbalance in males and females.
We performed linear regression (on each of the 4 populations separately) between normalized quantitative gene expression values and SNP genotypes or clone log2 ratios that were near the gene (SNP position or clone midpoint within 1 Mb and 2 Mb, respectively, of the probe midpoint position). We used different window sizes for SNPs and clones because clones are large (median size of ∼170 Kb) and structural variants can exert long-range effects (21), so a 2 Mb window is more appropriate. Statistical significance was evaluated through the use of permutations (27), as previously described (1), and a corrected p-value threshold of 0.001 was applied (see Methods). Repeated permutation exercises showed that our permutation thresholds were very stable (see Supplementary Table 4). Because we test a large number of genes, an additional correction is required. This can be done either by adjusting the threshold to a new corrected threshold above which all genes are expected to be significant (e.g. Bonferroni correction) or by setting the threshold to a value that generates a satisfactory false discovery rate (FDR). We used the second approach and estimated the FDR based on the number of genes tested and E...
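The regression-plus-permutation procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`cis_window`, `r_squared`, `permutation_pvalue`), the additive 0/1/2 genotype coding, and the choice to permute expression values across individuals and keep the strongest signal per permutation are all assumptions made here for illustration. For simple linear regression with a fixed sample size, the p-value is a monotonic function of r², so ranking by maximum r² is equivalent to ranking by minimum p-value.

```python
import numpy as np

def cis_window(snp_pos, probe_mid, window=1_000_000):
    """Boolean mask of variants whose position lies within `window` bp of the probe midpoint."""
    return np.abs(np.asarray(snp_pos) - probe_mid) <= window

def r_squared(expr, geno):
    """Squared Pearson correlation between expression and each genotype column.

    expr: shape (n_individuals,); geno: shape (n_individuals, n_snps).
    """
    e = expr - expr.mean()
    g = geno - geno.mean(axis=0)
    denom = (e @ e) * (g * g).sum(axis=0)
    # Monomorphic variants (zero variance) get r^2 = 0 instead of NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom > 0, (e @ g) ** 2 / denom, 0.0)

def permutation_pvalue(expr, geno, n_perm=1000, seed=0):
    """Empirical gene-level p-value for the best variant in the window.

    Assumption: expression values are shuffled across individuals, which keeps
    the correlation structure among variants intact while breaking any
    genotype-expression association.
    """
    rng = np.random.default_rng(seed)
    observed = r_squared(expr, geno).max()
    null_max = np.array(
        [r_squared(rng.permutation(expr), geno).max() for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null_max >= observed)) / (n_perm + 1)
    return observed, p_value
```

A gene would then be called significant when its empirical p-value falls below the chosen threshold (0.001 in the text, which requires at least 1,000 permutations to resolve). The same functions apply to clone log2 ratios by passing them in place of genotypes and using a 2 Mb window.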
Genetic variation influences gene expression, and this can be efficiently mapped to specific genomic regions and variants. We used gene expression profiling of EBV-transformed lymphoblastoid cell lines of all 270 individuals of the HapMap consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find gene expression levels to be heritable and differentiation between populations in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency HapMap) with gene expression identified at least 1348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis-signals and 15% of trans-signals, respectively. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans-effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. Finally, we explore a variety of methodologies that improve the current state of analysis of gene expression variation.
Understanding the molecular basis of human phenotypic variation is a key goal of human genetics, encompassing disease susceptibility, variable response to drugs and ultimately treatment and public health. Over the past decades studies have described and analyzed the genetic basis of human phenotypic variation ranging from whole organism phenotypes such as height 1, to molecular level phenotypes such as lipid levels 2,3. Previous studies have also investigated the effects of nucleotide variation in specific genes or genomic regions on complex and monogenic diseases.
Recently, there has been an explosion of genome-wide studies examining the genetic basis of complex diseases by exploring the effects of genetic variation such as single nucleotide polymorphisms (SNPs) 4-7 and copy number variants (CNVs) 8-10, some of which are clearly in non-coding regions of the genome 4-7,11. Technological advances have now made genome-wide association studies a reasonable and affordable approach to the study of complex phenotypes 12.
* Correspondence should be addressed to: Emmanouil T. Dermitzakis (md4@sanger.ac.uk; +44-1223-494866), Panagiotis Deloukas (panos@sanger.ac.uk; +44-1223-494909), Barbara E. Stranger (bes@sanger.ac.uk; +44-1223-834244), Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridge, UK.
We estimated the median and variance of each of the 47,294 probe types for each population, and analyzed the distribution of variance and median values of normalized values by Gene Ontology (GO) categories 26 after summarizing them in GO-slim categories 27. Specific GO-slim categories such as "chaperone regulatory activity" showed an excess of high variance of gene expression, while genes with extracellular function showed low levels of variation. "Chaperone regulatory activity" genes and "translational regulatory activity" g...
Studies correlating genetic variation to gene expression facilitate the interpretation of common human phenotypes and disease. As functional variants may be operating in a tissue-dependent manner, we performed gene expression profiling and association with genetic variants (SNPs) on three cell types of 75 individuals. We detected cell type-specific genetic effects, with 69-80% of regulatory variants operating in a cell type-specific manner, and identified multiple eQTLs per gene, unique or shared among cell types and positively correlated with the number of transcripts per gene. Cell type-specific eQTLs were found at larger distances from genes and had lower effect sizes, similar to known enhancers. These data suggest that the complete regulatory variant repertoire can only be uncovered in the context of cell type specificity.
The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. 
Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.
The recent success of genome-wide association studies (GWAS) is now followed by the challenge to determine how the reported susceptibility variants mediate complex traits and diseases. Expression quantitative trait loci (eQTLs) have been implicated in disease associations through overlaps between eQTLs and GWAS signals. However, the abundance of eQTLs and the strong correlation structure (LD) in the genome make it likely that some of these overlaps are coincidental and not driven by the same functional variants. In the present study, we propose an empirical methodology, which we call Regulatory Trait Concordance (RTC), that accounts for local LD structure and integrates eQTLs and GWAS results in order to reveal the subset of association signals that are due to cis eQTLs. We simulate genomic regions of various LD patterns with either one or two causal variants and show that our score outperforms SNP correlation metrics, be they statistical (r²) or historical (D'). Following the observation of a significant abundance of regulatory signals among currently published GWAS loci, we apply our method to prioritize relevant genes for each of the respective complex traits. We detect several potential disease-causing regulatory effects, with a strong enrichment for immunity-related conditions, consistent with the nature of the cell line tested (LCLs). Furthermore, we present an extension of the method in trans, where interrogating the whole genome for downstream effects of the disease variant can be informative regarding its unknown primary biological effect. We conclude that integrating cellular phenotype associations with organismal complex traits will facilitate the biological interpretation of the genetic effects on these traits.
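For readers unfamiliar with the two SNP correlation metrics that RTC is compared against, the sketch below computes r² and D' for a pair of biallelic loci from haplotype and allele frequencies. It illustrates only these standard comparator metrics, not the RTC score itself, which the abstract does not define. The function name `ld_metrics` and its argument names are assumptions introduced here.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2 is the squared correlation between the two allele indicators.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

For example, when a rare allele B (frequency 0.1) occurs only on haplotypes carrying a common allele A (frequency 0.5), `ld_metrics(0.1, 0.5, 0.1)` gives D' = 1 but r² of about 0.11. The two metrics can therefore disagree sharply about how well one SNP stands in for another, which is part of why pairwise LD alone is a weak guide to whether a GWAS signal and an eQTL share a causal variant.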
Summary: Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols.
Availability: http://www.sanger.ac.uk/resources/software/genevar
Contact: emmanouil.dermitzakis@unige.ch
Summary
Background: Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people. The health economic burden of osteoarthritis is increasing commensurate with obesity prevalence and longevity. Osteoarthritis has a strong genetic component but the success of previous genetic studies has been restricted due to insufficient sample sizes and phenotype heterogeneity.
Methods: We undertook a large genome-wide association study (GWAS) in 7410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11 009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7473 cases and 42 938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent.
Findings: We identified five genome-wide significant loci (binomial test p≤5·0×10⁻⁸) for association with osteoarthritis and three loci just below this threshold. The strongest association was on chromosome 3 with rs6976 (odds ratio 1·12 [95% CI 1·08–1·16]; p=7·24×10⁻¹¹), which is in perfect linkage disequilibrium with rs11177. This SNP encodes a missense polymorphism within the nucleostemin-encoding gene GNL3. Levels of nucleostemin were raised in chondrocytes from patients with osteoarthritis in functional studies. Other significant loci were on chromosome 9 close to ASTN2, chromosome 6 between FILIP1 and SENP6, chromosome 12 close to KLHDC5 and PTHLH, and in another region of chromosome 12 close to CHST11. One of the signals close to genome-wide significance was within the FTO gene, which is involved in regulation of bodyweight, a strong risk factor for osteoarthritis. All risk variants were common in frequency and exerted small effects.
Interpretation: Our findings provide insight into the genetics of arthritis and identify new pathways that might be amenable to future therapeutic intervention.
Funding: arcOGEN was funded by a special purpose grant from Arthritis Research UK.
Fast-evolving non-coding sequences: Over 1,300 conserved non-coding sequences were identified that appear to have undergone dramatic human-specific changes in selective pressures; these are enriched in recent segmental duplications, suggesting a recent change in selective constraint following duplication.