The efficacy and mechanisms of therapeutic action are largely described by atomic bonds and interactions local to drug binding sites. Here we introduce global connectivity analysis as a high-throughput computational assay of therapeutic action – inspired by the Google PageRank algorithm, which unearths the most “globally connected” websites from the information-dense world wide web (WWW). We execute short timescale (30 ps) molecular dynamics simulations with high sampling frequency (0.01 ps) to identify amino acid residue hubs whose global connectivity dynamics are characteristic of the ligand or mutation associated with the target protein. We find that unexpected allosteric hubs – up to 20 Å from the ATP binding site, but within 5 Å of the phosphorylation site – encode the Gibbs free energy of inhibition (ΔGinhibition) for select protein kinase-targeted cancer therapeutics. We further find that clinically relevant somatic cancer mutations implicated in both drug resistance and personalized drug sensitivity can be predicted in a high-throughput fashion. Our results establish global connectivity analysis as a potent assay of protein functional modulation. This sets the stage for unearthing disease-causal exome mutations and motivates forecasting clinical drug response on a patient-by-patient basis. We suggest incorporation of structure-guided genetic inference assays into pharmaceutical and healthcare oncology workflows.
The recent explosion of biomedical knowledge presents both a major opportunity and a challenge for scientists tackling complex problems in healthcare. Here we present an approach for synthesizing biomedical knowledge based on a combination of word embeddings and select cooccurrences. We evaluated our ability to recapitulate and retrospectively predict disease-gene associations from the Online Mendelian Inheritance in Man (OMIM) resource. Our metrics achieved an area under the curve (AUC) value of 0.981 at the recapitulation task for 2,400 disease-gene associations. At the most stringent cutoff, our metrics predicted 13.89% of these associations before their first cooccurrence in the literature, with a median time of 4 years between prediction and first cooccurrence. Finally, our literature metrics can be combined with human genetics data to retrospectively predict disease-gene associations, with IL-6 and Giant Cell Arteritis provided as an example. We believe this framework can provide robust biomedical hypotheses at a much faster pace than current standard practices.
Molecular mimicry of host proteins is an evolutionary strategy adopted by viruses to evade immune surveillance and exploit host cell systems. We report that SARS-CoV-2 has evolved a unique S1/S2 cleavage site (RRARSVAS), absent in any previous coronavirus sequenced, that results in mimicry of an identical FURIN-cleavable peptide on the human epithelial sodium channel α-subunit (ENaC-α). Genetic truncation at this ENaC-α cleavage site causes aldosterone dysregulation in patients, highlighting the functional importance of the mimicked SARS-CoV-2 peptide. Single cell RNA-seq from 65 studies shows significant overlap between the expression of ENaC-α and ACE2, the putative receptor for the virus, in cell types linked to the cardiovascular-renal-pulmonary pathophysiology of COVID-19. Triangulating this cellular fingerprint with amino acid cleavage signatures of 178 human proteases shows the potential for tissue-specific proteolytic degeneracy wired into the SARS-CoV-2 lifecycle. We extrapolate that the evolution of SARS-CoV-2 into a global coronavirus pandemic may be in part due to its targeted mimicry of human ENaC and hijack of the associated host proteolytic network.
Purpose: TNF-related apoptosis inducing ligand (TRAIL) expression by immune cells contributes to antitumor immunity. A naturally occurring splice variant of TRAIL, called TRAILshort, antagonizes TRAIL-dependent cell killing. It is unknown whether tumor cells express TRAILshort and whether it impacts antitumor immunity. Experimental Design: We used an unbiased informatics approach to identify TRAILshort expression in primary human cancers, and validated those results with IHC and ISH. TRAILshort-specific mAbs were used to determine the effect of TRAILshort on tumor cell sensitivity to TRAIL, and to immune effector cell dependent killing of autologous primary tumors. Results: As many as 40% of primary human tumors express TRAILshort by both RNA sequencing and IHC analysis. By ISH, TRAILshort expression is present in tumor cells and not bystander cells. TRAILshort inhibition enhances the sensitivity of cancer cell lines to TRAIL-dependent killing both in vitro and in immunodeficient xenograft mouse models. Immune effector cells isolated from patients with B-cell malignancies killed more autologous tumor cells in the presence compared with the absence of TRAILshort antibody (P < 0.05). Conclusions: These results identify TRAILshort in primary human malignancies, and suggest that TRAILshort blockade can augment the effector function of autologous immune effector cells.
Current unbiased approaches to mine the large amounts of patient-level data on mutations, structural variations and gene expression result in an unwieldy number of interactions and correlations, which cannot be parsed to identify disease drivers. Here we present an approach to encode mutational and structural variant data at a patient level in a semantic association space. This approach transforms the presence of a mutation (or other feature) in each patient into the quantitative semantic association score of the corresponding gene and the phenotype of interest, which we have trained on all publicly available literature using word-embedding neural networks. Using data from The Cancer Genome Atlas (TCGA), we encoded the mutation or structural variant status (incl. copy number, fusion and chromothripsis) of all patients in the Lung Adenocarcinoma and Mesothelioma cohorts into our semantic space. For each cancer, we first defined the set of genes that are most associated with it according to the literature. To project each patient into this semantic space, we next determined whether each patient had a mutation in the genes representing the disease semantic vector (e.g. NSCLC). For TCGA data we only counted non-Silent mutations and represented them as a binary number for each gene, i.e. 0 if the patient had no mutations in that gene and 1 if the patient had a non-Silent mutation in the gene. Each patient was then encoded in a binary vector with each member corresponding to a gene from the disease semantic vector. For example, lung adenocarcinoma was associated with 1,367 genes in our semantic space. A lung adenocarcinoma patient’s vector would then be composed of 1,367 binary numbers indicating whether each gene is mutated in that patient. We then multiplied these binary vectors with the semantic disease vector to obtain the patient’s projection in the disease space, which in effect replaces the binary number with the Semantic Association Score between the gene and the disease.
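The projection step described above can be illustrated with a minimal sketch. It assumes the multiplication is element-wise, which is consistent with the statement that each binary number is replaced by the gene's Semantic Association Score. The function name `project_patients`, the four-gene panel, and all score values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import numpy as np


def project_patients(mutation_matrix, semantic_scores):
    """Weight each patient's binary mutation flags by per-gene scores.

    mutation_matrix: (n_patients, n_genes) array of 0/1 flags, where 1 means
        the patient carries a non-silent mutation in that gene.
    semantic_scores: (n_genes,) array of gene-disease association scores.
    Returns an (n_patients, n_genes) array in which every 1 has been replaced
    by the corresponding gene's score and every 0 stays 0.
    """
    mutation_matrix = np.asarray(mutation_matrix, dtype=float)
    semantic_scores = np.asarray(semantic_scores, dtype=float)
    if mutation_matrix.ndim != 2 or semantic_scores.ndim != 1:
        raise ValueError("expected a 2-D mutation matrix and a 1-D score vector")
    if mutation_matrix.shape[1] != semantic_scores.shape[0]:
        raise ValueError("number of genes must match between inputs")
    # Broadcasting multiplies every patient row by the score vector.
    return mutation_matrix * semantic_scores


# Hypothetical toy panel; the abstract reports 1,367 genes for lung adenocarcinoma.
genes = ["EGFR", "KRAS", "STK11", "MET"]
scores = np.array([0.9, 0.8, 0.5, 0.6])  # made-up association scores

# Rows are patients, columns follow the order of `genes`.
patients = np.array([
    [1, 0, 0, 1],  # mutations in EGFR and MET
    [0, 1, 1, 0],  # mutations in KRAS and STK11
    [0, 0, 0, 0],  # no mutations in the panel
])

projected = project_patients(patients, scores)
print(projected)
```

Each row of `projected` is a real-valued patient vector in the disease space, so patients with mutations in strongly associated genes sit further from the origin than patients with mutations in weakly associated ones. Any clustering applied afterwards would operate on these weighted rows rather than on the raw binary flags.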
Contrary to clustering patient samples by their mutation or structural variant data alone, our projected patient vectors clustered patients together into 22 groups with high patient-to-patient similarity. These clusters recapitulate canonical knowledge about the disease, e.g. Lung Adenocarcinoma patients form clusters that include EGFR-driven and KRAS-driven cohorts. We also see novel groups of patients driven by genes such as MET, STK11 and MALAT1. These clusters can be further stratified by their survival status and other clinical features. We validated our approach with a non-TCGA Mesothelioma cohort, revealing similarities in patient stratification regardless of the data source. This approach represents a dramatic shift in patient segmentation, delivering real-time grouping of patients and biomarker identification, which can accelerate clinical trial design and therapeutic development strategy. Citation Format: Enrique Garcia-Rivera, Aaron S. Mansfield, Karthik Murugadoss, Murali Aravamudan. Patient segmentation using machine-learning based literature and genomic data synthesis uncovers novel cohorts of NSCLC and mesothelioma patients [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 2452.