Rationale:
Idiopathic and heritable pulmonary arterial hypertension (PAH) are rare but comprise a genetically heterogeneous patient group. RNA sequencing linked to the underlying genetic architecture can be used to better understand the underlying pathology by identifying key signaling pathways and to stratify patients more robustly according to clinical risk.
Objectives:
To use a three-stage design of RNA discovery, RNA validation and model construction, and model validation to define a set of PAH-associated RNAs and a single summarizing RNA model score. To define genes most likely to be involved in disease development, we performed Mendelian randomization (MR) analysis.
Methods:
RNA sequencing was performed on whole-blood samples from 359 patients with idiopathic, heritable, and drug-induced PAH and 72 age- and sex-matched healthy volunteers. The score was evaluated against disease severity markers including survival analysis using all-cause mortality from diagnosis. MR used known expression quantitative trait loci and summary statistics from a PAH genome-wide association study.
Measurements and Main Results:
We identified 507 genes with differential RNA expression in patients with PAH compared with control subjects. A model of 25 RNAs distinguished PAH with 87% accuracy (area under the curve 95% confidence interval: 0.791–0.945) in model validation. The RNA model score was associated with disease severity and long-term survival (P = 4.66 × 10⁻⁶) in PAH. MR detected an association between SMAD5 levels and PAH disease susceptibility (odds ratio, 0.317; 95% confidence interval, 0.129–0.776; P = 0.012).
Conclusions:
A whole-blood RNA signature of PAH, which includes RNAs relevant to disease pathogenesis, associates with disease severity and identifies patients with poor clinical outcomes. Genetic variants associated with lower SMAD5 expression may increase susceptibility to PAH.
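As a rough consistency check on the reported MR result, the odds ratio, confidence interval, and P value can be related through a normal approximation on the log-odds scale. The sketch below assumes the interval is a symmetric Wald interval on the log scale; the abstract does not state how the interval was derived, so this is an illustration rather than the authors' procedure. The function name `wald_check` is hypothetical.

```python
from math import erfc, exp, log, sqrt


def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds estimate, standard error, z statistic and
    two-sided P value from an odds ratio and its 95% confidence interval.

    Assumption: the interval is exp(beta +/- z_crit * se), i.e. a Wald
    interval that is symmetric on the log-odds scale.
    """
    beta = log(odds_ratio)
    # Width of the interval on the log scale is 2 * z_crit * se.
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided P value under a standard normal reference distribution.
    p = erfc(abs(z) / sqrt(2))
    # Geometric midpoint of the interval; should sit close to the odds ratio.
    midpoint = exp((log(ci_low) + log(ci_high)) / 2)
    return {"beta": beta, "se": se, "z": z, "p": p, "midpoint": midpoint}


if __name__ == "__main__":
    # Values quoted in the abstract for SMAD5 and PAH susceptibility.
    result = wald_check(0.317, 0.129, 0.776)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Under this assumption the recovered P value comes out close to the reported 0.012, and the geometric midpoint of the interval is close to the reported odds ratio, so the three figures are mutually consistent.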
A goal of genomics is to understand the relationships between biological processes. Pathways contribute to functional interplay within biological processes through complex but poorly understood interactions. However, limited functional references for global pathway relationships exist. Pathways from databases such as KEGG and Reactome provide discrete annotations of biological processes. Their relationships are currently inferred either from gene set enrichment within specific experiments or by simple overlap, linking pathway annotations that have genes in common. Here, we provide a unifying interpretation of functional interaction between pathways by systematically quantifying coexpression between 1,330 canonical pathways from the Molecular Signatures Database (MSigDB) to establish the Pathway Coexpression Network (PCxN). We estimated the correlation between canonical pathways valid in a broad context using a curated collection of 3,207 microarrays from 72 normal human tissues. PCxN accounts for shared genes between annotations to estimate significant correlations between pathways with related functions rather than with similar annotations. We demonstrate that PCxN provides novel insight into mechanisms of complex diseases using an Alzheimer’s Disease (AD) case study. PCxN retrieved pathways significantly correlated with an expert-curated AD gene list. These pathways have known associations with AD and were significantly enriched for genes independently associated with AD. As a further step, we show how PCxN complements the results of gene set enrichment methods by revealing relationships between enriched pathways and by identifying additional highly correlated pathways. PCxN revealed that correlated pathways from an AD expression profiling study include functional clusters involved in cell adhesion and oxidative stress. PCxN provides expanded connections to pathways from the extracellular matrix.
PCxN provides a powerful new framework for interrogation of global pathway relationships. Comprehensive exploration of PCxN can be performed at http://pcxn.org/.
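The idea of correlating pathways while accounting for shared genes can be illustrated with a deliberately simplified sketch. It assumes each pathway is summarized per sample by the mean expression of its genes, and that shared genes are simply dropped before correlating; the abstract does not specify PCxN's estimator, so the function names and this way of handling overlap are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pathway_summary(expr, genes):
    """Per-sample mean expression over the given genes.

    expr maps gene name -> list of expression values, one per sample.
    """
    n_samples = len(expr[genes[0]])
    return [sum(expr[g][s] for g in genes) / len(genes) for s in range(n_samples)]


def disjoint_pathway_correlation(expr, pathway_a, pathway_b):
    """Correlate two pathway summaries after removing shared genes, so the
    estimate is not driven by overlapping annotation alone."""
    shared = set(pathway_a) & set(pathway_b)
    only_a = sorted(set(pathway_a) - shared)
    only_b = sorted(set(pathway_b) - shared)
    if not only_a or not only_b:
        raise ValueError("one pathway has no genes outside the overlap")
    return pearson(pathway_summary(expr, only_a), pathway_summary(expr, only_b))


if __name__ == "__main__":
    # Toy expression matrix (made-up values): four samples, three genes.
    toy_expr = {
        "gene_1": [1.0, 2.0, 3.0, 4.0],
        "gene_2": [8.0, 6.0, 4.0, 2.0],
        "shared": [5.0, 1.0, 4.0, 2.0],
    }
    r = disjoint_pathway_correlation(
        toy_expr, ["gene_1", "shared"], ["gene_2", "shared"]
    )
    print(f"correlation without shared genes: {r:.3f}")
```

In the toy data the two pathways overlap in one gene, yet their unique genes move in opposite directions, so the overlap-free correlation is strongly negative even though a naive overlap measure would link them.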
Idiopathic pulmonary arterial hypertension (IPAH) is a rare but fatal disease diagnosed by right heart catheterisation and the exclusion of other forms of pulmonary arterial hypertension, producing a heterogeneous population with varied treatment response. Here we show that unsupervised machine learning identifies three major patient subgroups that account for 92% of the cohort, each with unique whole-blood transcriptomic and clinical feature signatures. These subgroups are associated with poor, moderate, and good prognosis. The poor prognosis subgroup is associated with upregulation of ALAS2 and downregulation of several immunoglobulin genes, while the good prognosis subgroup is defined by upregulation of the bone morphogenetic protein signalling regulator NOG and by the C/C variant of HLA-DPA1/DPB1 (independently associated with survival). These independently validated findings provide evidence for the existence of three major subgroups (endophenotypes) within the IPAH classification, and could improve risk stratification and provide molecular insights into the pathogenesis of IPAH.
Background: Pulmonary arterial hypertension (PAH) is a rare but life-shortening disease whose diagnosis is often delayed and requires an invasive right heart catheterisation. Identifying diagnostic biomarkers may improve screening to identify patients at risk of PAH earlier and provide new insights into disease pathogenesis. MicroRNAs are small, non-coding RNA molecules that have previously been shown to be dysregulated in PAH and to contribute to the disease process in animal models. Methods: Plasma samples from 64 treatment-naïve patients with PAH and 43 disease and healthy controls were profiled for microRNA expression by Agilent Microarray. Following quality control and normalisation, the cohort was split into training and validation sets. Four separate machine learning feature selection methods were applied to the training set, along with a univariate analysis. Findings: 20 microRNAs were identified as putative biomarkers by consensus feature selection from all four methods. Two microRNAs (miR-636 and miR-187-5p) were selected by all methods and used to predict PAH diagnosis with high accuracy. Integrating microRNA expression profiles with their associated target mRNA revealed 61 differentially expressed genes verified in two independent, publicly available PAH lung tissue data sets. Two of seven potentially novel gene targets were validated as differentially expressed in vitro in human pulmonary artery smooth muscle cells. Interpretation: This consensus of multiple machine learning approaches identified two miRNAs that were able to distinguish PAH from both disease and healthy controls. These circulating miRNAs and their target genes may provide insight into PAH pathogenesis and reveal novel regulators of disease and putative drug targets.
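The consensus step described above, keeping features that several selection methods agree on, can be sketched as a simple vote count. This is a minimal illustration, not the authors' pipeline: the method labels and the `toy-miR-*` names are placeholders, the per-method selections are invented for the example, and only miR-636 and miR-187-5p are taken from the abstract as the features chosen by all methods.

```python
from collections import Counter


def consensus_features(selections, min_votes=None):
    """Return features chosen by at least min_votes selection methods.

    selections maps a method label -> iterable of selected feature names.
    By default a feature must be chosen by every method.
    """
    if min_votes is None:
        min_votes = len(selections)
    # Count each feature at most once per method.
    votes = Counter(
        feature for chosen in selections.values() for feature in set(chosen)
    )
    return sorted(f for f, n in votes.items() if n >= min_votes)


if __name__ == "__main__":
    # Hypothetical selections; method names and toy-miR-* are placeholders.
    toy_selections = {
        "method_1": {"miR-636", "miR-187-5p", "toy-miR-a"},
        "method_2": {"miR-636", "miR-187-5p", "toy-miR-b"},
        "method_3": {"miR-636", "miR-187-5p", "toy-miR-a"},
        "method_4": {"miR-636", "miR-187-5p"},
    }
    print("selected by all methods:", consensus_features(toy_selections))
    print("selected by two or more:", consensus_features(toy_selections, 2))
```

Requiring agreement across all methods gives a small, conservative set, while lowering `min_votes` trades stringency for a broader list of candidates.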