The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human traits and diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genetic loci discovered by genome-wide association studies (GWAS). Across a broad set of complex traits and diseases, we find widespread dose-dependent effects of RNA expression and splicing, with higher impact on molecular phenotypes translating into higher impact downstream. Using colocalization and association approaches that take into account the observed allelic heterogeneity, we propose potential target genes for 47% (2,519 out of 5,385) of the GWAS loci examined. Our results demonstrate the translational relevance of the GTEx resources and highlight the need to increase their resolution and breadth to further our understanding of the genotype-phenotype link.
Harmonized GWAS and QTL datasets

The final GTEx data release (v8) includes 54 primary human tissues, 49 of which included at least 65 samples and were used for cis-QTL mapping (Fig. 1) (9). This phase increases the number of available tissues relative to previous GTEx publications (v6p; 44 tissues) (8) and doubles the sample size from 7,051 RNA-Seq samples from 449 individuals to 15,253 samples from 838 individuals, now all with whole genome sequencing data as opposed to genotype imputation in v6p. Furthermore, the v8 core data resources now include splicing QTLs (9), allowing parallel analysis of both expression and splicing variation underlying complex traits. Using these resources, we investigated the contribution of expression and splicing QTLs in cis (eQTL and sQTL, respectively) to complex trait variance and etiology.

We retained 87 GWAS datasets representing 74 distinct complex traits for further analyses (table S1 and fig. S1) after stringent quality control (fig. S2; (21)) and data harmonization (fig. S3, fig. S4).
For a given gene, the mediating effects of primary and secondary eQTLs were significantly more correlated than expected under a null distribution (p < 1 × 10⁻³⁰; Fig. 2D-E). The null was obtained by sampling GWAS effect sizes from a bivariate normal distribution, which accounts for the small observed LD between primary and secondary eQTLs, while keeping the observed eQTL effect sizes. Interestingly, the correlation between primary and secondary eQTLs for non-colocalized genes (rcp < 0.01), which were used as controls (9, 21), was significantly higher than this more accurate null, indicating that even genes with very low colocalization probability include many that are likely causal. Given this concordance between multiple independent eQTLs, and the widespread allelic heterogeneity detected with currently available sample sizes, methods that assume a single causal variant are highly limited. The approaches described here enable insights into how multiple regulatory effects converge to mediate the same trait association.
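The null distribution described above can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the mediating effect of an eQTL is the ratio of its GWAS effect size to its eQTL effect size, that concordance is measured with a Pearson correlation across genes, and that null GWAS effects have a common standard deviation. The function names and the parameters `ld_r`, `gwas_se`, and `n_draws` are hypothetical.

```python
import numpy as np


def null_mediated_effect_correlations(gamma_primary, gamma_secondary, ld_r,
                                      gwas_se=1.0, n_draws=1000, seed=0):
    """Null correlations between primary and secondary mediated effects.

    gamma_primary, gamma_secondary: observed eQTL effect sizes, one pair per gene.
    ld_r: LD correlation between the two eQTLs of each gene (scalar or per gene).
    gwas_se: assumed standard deviation of the null GWAS effect sizes.
    """
    rng = np.random.default_rng(seed)
    g1 = np.asarray(gamma_primary, dtype=float)
    g2 = np.asarray(gamma_secondary, dtype=float)
    r = np.broadcast_to(np.asarray(ld_r, dtype=float), g1.shape)
    null_corr = np.empty(n_draws)
    for i in range(n_draws):
        # Draw a bivariate normal pair per gene with correlation r,
        # using z2 = r * z1 + sqrt(1 - r^2) * eps.
        z1 = rng.standard_normal(g1.shape)
        eps = rng.standard_normal(g1.shape)
        z2 = r * z1 + np.sqrt(1.0 - r ** 2) * eps
        # Assumed mediated effect: null GWAS effect over observed eQTL effect.
        beta1 = gwas_se * z1 / g1
        beta2 = gwas_se * z2 / g2
        null_corr[i] = np.corrcoef(beta1, beta2)[0, 1]
    return null_corr


def empirical_p_value(observed_corr, null_corr):
    """One-sided empirical p-value with the usual +1 correction."""
    null_corr = np.asarray(null_corr, dtype=float)
    exceed = np.sum(null_corr >= observed_corr)
    return float((1 + exceed) / (1 + null_corr.size))


if __name__ == "__main__":
    # Toy inputs only; these are not GTEx or GWAS values.
    rng = np.random.default_rng(1)
    g1 = rng.uniform(0.5, 1.5, size=500)
    g2 = rng.uniform(0.5, 1.5, size=500)
    null = null_mediated_effect_correlations(g1, g2, ld_r=0.1, n_draws=200)
    print(null.mean(), empirical_p_value(0.3, null))
```

Because the eQTL effect sizes are held fixed and only the GWAS effects are redrawn, any correlation in the null comes from LD between the two variants and from the shared structure of the eQTL effects. An observed correlation well above the simulated values is then evidence of concordant mediation. An empirical p-value from a finite number of draws cannot go below 1 / (n_draws + 1), so very small p-values would need a parametric approximation instead.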