Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiologic agent of COVID-19, enters human cells using the angiotensin-converting enzyme 2 (ACE2) protein as a receptor. ACE2 is thus key to the infection and treatment of the coronavirus. ACE2 is highly expressed in the heart and in the respiratory and gastrointestinal tracts, playing important regulatory roles in the cardiovascular and other biologic systems. However, the genetic basis of ACE2 protein levels is not well understood. Methods: We conduct the largest genome-wide association meta-analysis of plasma ACE2 levels to date, in over 28,000 individuals of the SCALLOP Consortium. We summarize the cross-sectional epidemiologic correlates of circulating ACE2. Using the summary-statistics-based high-definition likelihood method, we estimate relevant genetic correlations with cardiometabolic phenotypes, COVID-19, and other human complex traits and diseases. We perform causal inference of soluble ACE2 on vascular disease outcomes and COVID-19 disease severity using Mendelian randomization. We also perform in silico functional analysis by integrating with other types of omics data. Results: We identified ten loci, including eight novel loci, capturing 30% of the protein's heritability. We detected that plasma ACE2 was genetically correlated with vascular diseases, severe COVID-19, and a wide range of human complex diseases and medications. An X-chromosome cis-pQTL-based Mendelian randomization analysis suggested a causal effect of elevated ACE2 levels on COVID-19 severity (odds ratio (OR), 1.63; 95% CI, 1.10 to 2.42; P = 0.01), hospitalization (OR, 1.52; 95% CI, 1.05 to 2.21; P = 0.03), and infection (OR, 1.60; 95% CI, 1.08 to 2.37; P = 0.02). Tissue- and cell-type-specific transcriptomic and epigenomic analysis revealed that the ACE2 regulatory variants were enriched for DNA methylation sites in blood immune cells.
Conclusions: Human plasma ACE2 shares a genetic basis with cardiovascular disease, COVID-19, and other related diseases. The genetic architecture of the ACE2 protein is mapped, providing a useful resource for further biological and clinical studies on this coronavirus receptor.
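The Mendelian randomization estimates quoted in this abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single-instrument Wald ratio estimator and a normal approximation on the log-odds scale, neither of which the abstract confirms as the authors' exact procedure. The function names `wald_ratio` and `p_from_or_ci` are hypothetical helpers, not part of any published pipeline.

```python
import math


def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Single-instrument MR estimate (assumed estimator, not confirmed by the abstract).

    Returns the log-odds change in the outcome per unit of exposure and a
    first-order standard error that ignores uncertainty in beta_exposure.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its SE, and a two-sided P value from an OR and its 95% CI.

    Assumes the CI is symmetric on the log scale (normal approximation).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return log_or, se, p_value


# Odds ratios and 95% CIs exactly as reported in the abstract.
reported = {
    "severity": (1.63, 1.10, 2.42),
    "hospitalization": (1.52, 1.05, 2.21),
    "infection": (1.60, 1.08, 2.37),
}
recomputed = {name: p_from_or_ci(*values) for name, values in reported.items()}

for name, (log_or, se, p_value) in recomputed.items():
    print(f"{name}: log(OR)={log_or:.3f}, SE={se:.3f}, P={p_value:.3f}")
```

Under these assumptions the recomputed P values come out at roughly 0.015, 0.027, and 0.019, which agrees with the rounded values of 0.01, 0.03, and 0.02 given in the abstract.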
Lung-function impairment underlies chronic obstructive pulmonary disease (COPD) and predicts mortality. In the largest multi-ancestry genome-wide association meta-analysis of lung function to date, comprising 580,869 participants, we identified 1,020 independent association signals implicating 559 genes supported by ≥2 criteria from a systematic variant-to-gene mapping framework. These genes were enriched in 29 pathways. Individual variants showed heterogeneity across ancestries, age and smoking groups, and collectively as a genetic risk score showed strong association with COPD across ancestry groups. We undertook phenome-wide association studies for selected associated variants as well as trait and pathway-specific genetic risk scores to infer possible consequences of intervening in pathways underlying lung function. We highlight new putative causal variants, genes, proteins and pathways, including those targeted by existing drugs. These findings bring us closer to understanding the mechanisms underlying lung function and COPD, and should inform functional genomics experiments and potentially future COPD therapies.
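A genetic risk score of the kind mentioned in this abstract is commonly computed as a weighted sum of allele dosages. The following is a minimal sketch under that assumption; the function name, the dosage matrix, and the weights are all hypothetical and are not taken from the study.

```python
import numpy as np


def genetic_risk_score(dosages, weights):
    """Weighted genetic risk score: one score per individual.

    dosages: (n_individuals, n_variants) effect-allele counts in [0, 2]
    weights: (n_variants,) per-variant effect sizes (hypothetical values here)
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be 2-D with one column per weight")
    return dosages @ weights


# Toy data: two individuals, three variants (made-up numbers for illustration).
example_dosages = [[0, 1, 2], [2, 2, 0]]
example_weights = [0.1, -0.2, 0.05]
example_scores = genetic_risk_score(example_dosages, example_weights)
```

A pathway-specific score would follow the same pattern, restricted to the columns for variants mapped to genes in that pathway.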
The Plant Genome
Wheat breeding has progressed dramatically in the last century thanks to the combination of various technologies (Poland et al., 2012); taken together, these advancements have driven the yearly genetic gain through selective breeding to a nearly linear increase of 1% in the potential grain yield (Bassi et al., 2016). Faced with human population growth and uncertain climates, however, global wheat production still falls short (Curtis and Halford, 2014), as the global demand for wheat is projected to increase by 60% when the population reaches 9.8 billion by 2050 (Alexandratos and Bruinsma, 2012). The emphasis now is increasingly not only meeting the food
Lung function impairment underlies chronic obstructive pulmonary disease (COPD) and predicts mortality. In the largest multi-ancestry GWAS meta-analysis of lung function to date, comprising 580,869 participants, 1020 independent association signals identified 559 genes supported by ≥2 criteria from a systematic variant-to-gene mapping framework. These genes were enriched in 29 pathways. Individual variants showed heterogeneity across ancestries, age and smoking groups, and collectively as a genetic risk score (GRS) showed strong association with COPD across ancestry groups. We undertook phenome-wide association studies (PheWAS) for selected associated variants, and trait and pathway-specific GRS to infer possible consequences of intervening in pathways underlying lung function. We highlight new putative causal variants, genes, proteins and pathways, including those targeted by existing drugs. These findings bring us closer to understanding the mechanisms underlying lung function and COPD, and should inform functional genomics experiments and potentially future COPD therapies.
Privacy protection is a core principle of genomic but not proteomic research. We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS), calculated continuous protein level genotype probabilities, and then applied a naïve Bayesian approach to link SomaScan 1.3K proteomes to genomes for 2812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA). We linked 90–95% of proteomes to their correct genome, and for 95–99% we identified the 1% most likely links. The linking accuracy in subjects with African ancestry was lower (~ 60%) unless training included diverse subjects. With larger profiling (SomaScan 5K) in the Atherosclerosis Risk in Communities (ARIC) study, correct identification was > 99% even in mixed ancestry populations. We also linked proteomes to proteomes and used the proteome alone to determine features such as sex, ancestry, and first-degree relatives. When serial proteomes are available, the linking algorithm can be used to identify and correct mislabeled samples. This work also demonstrates the importance of including diverse populations in omics research and shows that large proteomic datasets (> 1000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered unidentifiable.
Crucial to variety improvement programs is the reliable and accurate prediction of a genotype’s performance across environments. However, due to the impactful presence of genotype by environment (G×E) interaction that dictates how changes in expression and function of genes influence target traits in different environments, prediction performance of genomic selection (GS) using single-environment models often falls short. Furthermore, despite the successes of genome-wide association studies (GWAS), the genetic insights derived from genome-to-phenome mapping have not yet been incorporated in predictive analytics, making GS models that use a Gaussian kernel primarily an estimator of genomic similarity, instead of the underlying genetic characteristics of the populations. Here, we developed a GS framework that, in addition to capturing the overall genomic relationship, can capitalize on the signal of genetic associations of the phenotypic variation as well as the genetic characteristics of the populations. The capacity of predicting the performance of populations across environments was demonstrated by an overall gain in predictability of up to 31% for the winter wheat DH population. Compared to Gaussian kernels, we showed that our multi-environment weighted kernels could better leverage the significance of genetic associations and yielded a marked improvement of 4–33% in prediction accuracy for half-sib families. Furthermore, the flexibility incorporated in our Bayesian implementation provides the generalizable capacity required for predicting multiple genetically highly heterogeneous populations across environments, allowing reliable GS for genetic improvement programs that have no access to genetically uniform material.
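To make the contrast between a plain Gaussian kernel and an association-weighted one concrete, here is a minimal sketch. It assumes markers are weighted by -log10 of their GWAS P values and that squared distances are scaled by their median; both choices are illustrative assumptions, and this is not the authors' multi-environment weighted kernel, whose exact form the abstract does not give.

```python
import numpy as np


def gaussian_kernel(markers, bandwidth=1.0, weights=None):
    """Gaussian kernel between lines from a marker matrix.

    markers: (n_lines, n_markers) genotype codes, e.g. 0/1/2
    weights: optional per-marker weights; None gives the unweighted kernel
    Squared distances are scaled by their median (an assumed convention).
    """
    m = np.asarray(markers, dtype=float)
    w = np.ones(m.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so weighted and unweighted distances are comparable
    diff = m[:, None, :] - m[None, :, :]
    d2 = (w * diff**2).sum(axis=2)  # weighted squared Euclidean distance
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    if scale == 0:  # all lines identical; avoid dividing by zero
        scale = 1.0
    return np.exp(-bandwidth * d2 / scale)


# Toy marker matrix: four lines, three markers (made-up values).
toy_markers = np.array([[0, 0, 2], [0, 2, 2], [2, 0, 0], [2, 2, 0]])
# Hypothetical GWAS P values; marker 0 is strongly associated, marker 1 is not.
toy_p_values = np.array([1e-8, 0.5, 0.01])

K_plain = gaussian_kernel(toy_markers)
K_weighted = gaussian_kernel(toy_markers, weights=-np.log10(toy_p_values))
```

In this toy example, lines 0 and 1 differ only at the weakly associated marker, so the weighted kernel treats them as more similar than the plain kernel does. That is the sense in which a weighted kernel can reflect the genetics of the trait and not just overall genomic similarity.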
Introduction: Privacy protection is a core principle of genomic research but needs further refinement for high-throughput proteomic platforms. Methods: We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS) and then calculated genotype probabilities by protein level for each protein-genotype combination (training). Using the most significant 100 proteins, we applied a naive Bayesian approach to match proteomes to genomes for 2,812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA) with SomaScan 1.3K proteomes and also 2,646 COPDGene subjects with SomaScan 5K proteomes (testing). We tested whether subtracting the mean genotype effect for each pQTL SNP would obscure genetic identity. Results: In the four testing cohorts, we were able to match 90%-95% of proteomes to their correct genome, and for 95%-99% we could match the proteome to the 1% most likely genome. With larger profiling (SomaScan 5K), correct identification was > 99%. The accuracy of matching in subjects with African ancestry was lower (~60%) unless training included diverse subjects. Mean genotype effect adjustment reduced identification accuracy nearly to that of random guessing. Conclusion: Large proteomic datasets (> 1,000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered deidentified. These findings suggest that large-scale proteomic data be given the privacy protections of genomic data, or that bioinformatic transformations (such as adjustment for genotype effect) should be applied to obfuscate identity.
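The matching procedure described above can be sketched on simulated data. This is an illustration under strong assumptions, not the study's implementation: protein levels are modeled as Gaussian within each genotype class, each protein has exactly one pQTL SNP, candidate genomes have a uniform prior, and the simulated effect sizes are far larger than typical real pQTL effects. All function names are hypothetical.

```python
import numpy as np


def fit_genotype_models(levels, genotypes):
    """Training step: per-protein mean and SD of protein level for each genotype.

    levels, genotypes: (n_subjects, n_proteins); genotypes are 0/1/2 at the
    (assumed single) pQTL SNP of each protein. Returns two (3, n_proteins) arrays.
    """
    n_proteins = levels.shape[1]
    means = np.zeros((3, n_proteins))
    sds = np.ones((3, n_proteins))  # fallback for genotype classes with < 2 subjects
    for g in range(3):
        for j in range(n_proteins):
            x = levels[genotypes[:, j] == g, j]
            if x.size >= 2:
                means[g, j] = x.mean()
                sds[g, j] = max(x.std(ddof=1), 1e-3)
    return means, sds


def log_likelihood_matrix(test_levels, candidate_genotypes, means, sds):
    """Naive Bayes score: sum over proteins of log P(level | candidate genotype).

    Returns an (n_proteomes, n_genomes) matrix; additive constants are dropped.
    """
    cols = np.arange(means.shape[1])
    mu = means[candidate_genotypes, cols]  # (n_genomes, n_proteins)
    sd = sds[candidate_genotypes, cols]
    z = (test_levels[:, None, :] - mu[None, :, :]) / sd[None, :, :]
    return (-0.5 * z**2 - np.log(sd)[None, :, :]).sum(axis=2)


def simulate_matching(seed=0, n_train=500, n_test=50, n_proteins=100, effect=1.5):
    """Simulate pQTL data, then report the fraction of proteomes matched correctly."""
    rng = np.random.default_rng(seed)

    def draw(n):
        g = rng.binomial(2, 0.3, size=(n, n_proteins))  # assumed allele frequency
        x = effect * g + rng.normal(size=(n, n_proteins))
        return x, g

    train_x, train_g = draw(n_train)
    test_x, test_g = draw(n_test)
    means, sds = fit_genotype_models(train_x, train_g)
    scores = log_likelihood_matrix(test_x, test_g, means, sds)
    best = scores.argmax(axis=1)  # most likely genome for each proteome
    return float((best == np.arange(n_test)).mean())


accuracy = simulate_matching()
print(f"fraction of proteomes matched to the correct genome: {accuracy:.2f}")
```

Because the score is a sum of independent per-protein terms, adding more informative proteins sharpens the separation between the correct genome and the rest. This is consistent with the higher identification rate the abstract reports for the larger SomaScan 5K panel.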