Jiangming Sun scite author profile

ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics

Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high quality data set is a vital step in realizing this goal and this work aims to compile such a comprehensive chemogenomics dataset. This dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry-scale data not only for building predictive models of in silico polypharmacology and off-target effects but also for the validation of cheminformatics approaches in general.

Electronic supplementary material: The online version of this article (doi:10.1186/s13321-017-0203-5) contains supplementary material, which is available to authorized users.

Applying Mondrian Cross-Conformal Prediction To Estimate Prediction Confidence on Large Imbalanced Bioactivity Data Sets

Sun, Carlsson, Ahlberg et al. 2017. J. Chem. Inf. Model.

Conformal prediction has been proposed as a more rigorous way to define prediction confidence compared to other application domain concepts that have earlier been used for QSAR modeling. One main advantage of such a method is that it provides a prediction region potentially with multiple predicted labels, which contrasts to the single valued (regression) or single label (classification) output predictions by standard QSAR modeling algorithms. Standard conformal prediction might not be suitable for imbalanced data sets. Therefore, Mondrian cross-conformal prediction (MCCP) which combines the Mondrian inductive conformal prediction with cross-fold calibration sets has been introduced. In this study, the MCCP method was applied to 18 publicly available data sets that have various imbalance levels varying from 1:10 to 1:1000 (ratio of active/inactive compounds). Our results show that MCCP in general performed well on bioactivity data sets with various imbalance levels. More importantly, the method not only provides confidence of prediction and prediction regions compared to standard machine learning methods but also produces valid predictions for the minority class. In addition, a compound similarity based nonconformity measure was investigated. Our results demonstrate that although it gives valid predictions, its efficiency is much worse than that of model dependent metrics.
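The idea of class-conditional (Mondrian) calibration combined with cross-fold calibration sets can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional toy data, the nearest-centroid stand-in model, the distance-based nonconformity score, the averaging of p-values across folds, and all function names.

```python
import random
from statistics import mean


def fit_centroids(train):
    """Toy stand-in model (assumption): the mean feature value of each class."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def nonconformity(centroids, x, y):
    """Distance to the centroid of y minus distance to the nearest other centroid."""
    own = abs(x - centroids[y])
    other = min(abs(x - c) for label, c in centroids.items() if label != y)
    return own - other


def stratified_folds(data, k, seed):
    """Deal each class round-robin into k folds, so a class with at least
    k members appears in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted({y for _, y in data}):
        members = [ex for ex in data if ex[1] == label]
        rng.shuffle(members)
        for i, ex in enumerate(members):
            folds[i % k].append(ex)
    return folds


def mccp_p_values(data, x_new, k=5, seed=0):
    """Return one p-value per candidate label for x_new."""
    labels = sorted({y for _, y in data})
    folds = stratified_folds(data, k, seed)
    per_label = {y: [] for y in labels}
    for i, calibration in enumerate(folds):
        # Cross-conformal step: fold i calibrates, the other folds train.
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        centroids = fit_centroids(train)
        for y in labels:
            # Mondrian step: compare only against calibration examples of class y.
            cal_scores = [nonconformity(centroids, cx, cy)
                          for cx, cy in calibration if cy == y]
            new_score = nonconformity(centroids, x_new, y)
            at_least = sum(1 for s in cal_scores if s >= new_score)
            per_label[y].append((at_least + 1) / (len(cal_scores) + 1))
    # Aggregating fold p-values by their mean is an assumption of this sketch.
    return {y: mean(ps) for y, ps in per_label.items()}


def prediction_region(p_values, epsilon):
    """Keep every label whose p-value exceeds the significance level epsilon."""
    return {y for y, p in p_values.items() if p > epsilon}


# Toy 1:10 imbalanced data: label 1 = active, label 0 = inactive.
inactives = [(i * 0.004, 0) for i in range(500)]
actives = [(4.5 + j * 0.02, 1) for j in range(50)]
data = inactives + actives

print(prediction_region(mccp_p_values(data, 5.0), epsilon=0.2))
print(prediction_region(mccp_p_values(data, 1.0), epsilon=0.2))
```

On this toy data the region for x = 5.0 contains only the active label and the region for x = 1.0 contains only the inactive label. Because each label is calibrated against its own class, the minority class gets its own p-value distribution instead of being swamped by the majority class.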

Loss of TFB1M results in mitochondrial dysfunction that leads to impaired insulin secretion and diabetes

Sharoyko, Abels, Sun et al. 2014

We have previously identified transcription factor B1 mitochondrial (TFB1M) as a type 2 diabetes (T2D) risk gene, using human and mouse genetics. To further understand the function of TFB1M and how it is associated with T2D, we created a β-cell-specific knockout of Tfb1m, which gradually developed diabetes. Prior to the onset of diabetes, β-Tfb1m(-/-) mice exhibited retarded glucose clearance owing to impaired insulin secretion. β-Tfb1m(-/-) islets released less insulin in response to fuels, contained less insulin and secretory granules and displayed reduced β-cell mass. Moreover, mitochondria in Tfb1m-deficient β-cells were more abundant with disrupted architecture. TFB1M is known to control mitochondrial protein translation by adenine dimethylation of 12S ribosomal RNA (rRNA). Here, we found that the levels of TFB1M and mitochondrial-encoded proteins, mitochondrial 12S rRNA methylation, ATP production and oxygen consumption were reduced in β-Tfb1m(-/-) islets. Furthermore, the levels of reactive oxygen species (ROS) in response to cellular stress were increased whereas induction of defense mechanisms was attenuated. We also show increased apoptosis and necrosis as well as infiltration of macrophages and CD4(+) cells in the islets. Taken together, our findings demonstrate that Tfb1m-deficiency in β-cells caused mitochondrial dysfunction and subsequently diabetes owing to combined loss of β-cell function and mass. These observations reflect pathogenetic processes in human islets: using RNA sequencing, we found that the TFB1M risk variant exhibited a negative gene-dosage effect on islet TFB1M mRNA levels, as well as insulin secretion. Our findings highlight the role of mitochondrial dysfunction in impairments of β-cell function and mass, the hallmarks of T2D.

Glucagon-Like Peptide 1 Stimulates Insulin Secretion via Inhibiting RhoA/ROCK Signaling and Disassembling Glucotoxicity-Induced Stress Fibers

Kong, Yan, Sun et al. 2014

Chronic hyperglycemia leads to pancreatic β-cell dysfunction characterized by diminished glucose-stimulated insulin secretion (GSIS), but the precise cellular processes involved are largely unknown. Here we show that pancreatic β-cells chronically exposed to a high glucose level displayed substantially increased amounts of stress fibers compared with β-cells cultured at a low glucose level. β-Cells at high glucose were refractory to glucose-induced actin cytoskeleton remodeling and insulin secretion. Importantly, F-actin depolymerization by either cytochalasin B or latrunculin B restored glucotoxicity-diminished GSIS. The effects of glucotoxicity on increasing stress fibers and reducing GSIS were reversed by Y-27632, a Rho-associated kinase (ROCK)-specific inhibitor, which caused actin depolymerization and enhanced GSIS. Notably, glucagon-like peptide-1-(7-36) amide (GLP-1), a peptide hormone that stimulates GSIS at both normal and hyperglycemic conditions, also reversed glucotoxicity-induced increase of stress fibers and reduction of GSIS. In addition, GLP-1 inhibited glucotoxicity-induced activation of RhoA/ROCK and thereby resulted in actin depolymerization and potentiation of GSIS. Furthermore, this effect of GLP-1 was mimicked by cAMP-increasing agents forskolin and 3-isobutyl-1-methylxanthine as well as the protein kinase A agonist 6-Bnz-cAMP-AM whereas it was abolished by the protein kinase A inhibitor Rp-Adenosine 3',5'-cyclic monophosphorothioate triethylammonium salt. To establish a clinical relevance of our findings, we examined the association of genetic variants of RhoA/ROCK with metabolic traits in homeostasis model assessment index of insulin resistance. Several single-nucleotide polymorphisms in and around RHOA were associated with elevated fasting insulin and homeostasis model assessment index of insulin resistance, suggesting a possible role in metabolic dysregulation. 
Collectively these findings unravel a novel mechanism whereby GLP-1 potentiates glucotoxicity-diminished GSIS by depolymerizing F-actin cytoskeleton via protein kinase A-mediated inhibition of the RhoA-ROCK signaling pathway.

Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models

Sturm, Sun, Vandriessche et al. 2018. J. Chem. Inf. Model.

The volume of high-throughput screening data has increased considerably since the beginning of the era of automated biochemical and cell-based assays. This information-rich data source provides tremendous repurposing opportunities for data mining. It was recently shown that biochemical or cell-based assay results can be compiled into so-called high-throughput screening fingerprints (HTSFPs), a new type of descriptor describing molecular bioactivity profiles that can be applied in virtual screening, iterative screening, and target deconvolution. However, studies of HTSFPs and machine learning have so far mainly focused on predicting the outcome of molecules in single high-throughput assays, and no one has reported modelling compounds' biochemical assay activities towards a panel of target proteins. Therefore, there is a need for a detailed analysis of the performance of predictive models built with HTSFPs relative to models built with more widely used structural descriptors, in terms of both hit identification and scaffold hopping potential. In this article, in-house HTSFPs were built and combined with multi-task deep learning and support vector machine methods to build compound activity predictive models. The performance of HTSFP models was compared to that of models built with the conventional structural descriptors ECFPs. Moreover, we investigated the effect of high-throughput screening false positives and negatives on the performance of deep learning models. Our results showed that the two fingerprints yielded similar performance and diverse hits with very little overlap, thus demonstrating the orthogonality of bioactivity profile-based descriptors to structural descriptors. Therefore, modelling compound activity data using ECFPs together with HTSFPs increases the scaffold hopping potential of the predictive models.
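As a rough illustration of the two ideas in this abstract (a bioactivity-profile fingerprint and the overlap between hit lists from two descriptor types), the following is a minimal sketch. The encoding is an assumption and not the authors' procedure: each compound is represented by one assay score per assay with a fill value for untested assays, and hit-list overlap is measured with the Jaccard index. All assay names, compound identifiers, and function names are hypothetical.

```python
def build_htsfp(assay_scores, assay_order, missing=0.0):
    """Return a fixed-length profile: one score per assay, `missing` if untested."""
    return [assay_scores.get(assay, missing) for assay in assay_order]


def hit_overlap(hits_a, hits_b):
    """Jaccard overlap between two hit lists (0.0 when both are empty)."""
    a, b = set(hits_a), set(hits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical assay panel and a compound tested in two of the three assays.
assays = ["assay_1", "assay_2", "assay_3"]
profile = build_htsfp({"assay_1": 2.5, "assay_3": -0.4}, assays)

# Hypothetical hit lists returned by a structure-based and a profile-based model.
ecfp_hits = ["cpd_1", "cpd_2", "cpd_3"]
htsfp_hits = ["cpd_3", "cpd_4", "cpd_5"]
overlap = hit_overlap(ecfp_hits, htsfp_hits)

print(profile, overlap)
```

A low overlap value between the two hit lists is the kind of signal the abstract describes as orthogonality between bioactivity-profile and structural descriptors.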

Improving the performance of β-turn prediction using predicted shape strings and a two-layer support vector machine model

Tang, Liu et al. 2011. BMC Bioinformatics

Genome-wide association study identifies 16 genomic regions associated with circulating cytokines at birth

et al. 2020

Circulating inflammatory markers are essential to human health and disease, and they are often dysregulated or malfunctioning in cancers as well as in cardiovascular, metabolic, immunologic and neuropsychiatric disorders. However, the genetic contribution to the physiological variation of levels of circulating inflammatory markers is largely unknown. Here we report the results of a genome-wide genetic study of blood concentration of ten cytokines, including the hitherto unexplored calcium-binding protein (S100B). The study leverages a unique sample of neonatal blood spots from 9,459 Danish subjects from the iPSYCH initiative. We estimate the SNP-heritability of marker levels as ranging from essentially zero for Erythropoietin (EPO) up to 73% for S100B. We identify and replicate 16 associated genomic regions (p < 5 × 10⁻⁹), of which four are novel. We show that the associated variants map to enhancer elements, suggesting a possible transcriptional effect of genomic variants on the cytokine levels. The identification of the genetic architecture underlying the basic levels of cytokines is likely to prompt studies investigating the relationship between cytokines and complex disease. Our results also suggest that the genetic architecture of cytokines is stable from neonatal to adult life.

Plaque Vulnerability Index Predicts Cardiovascular Events: A Histological Study of an Endarterectomy Cohort

Gonçalves, Sun, Tengryd et al. 2021. JAHA

Background: The balance between stabilizing and destabilizing atherosclerotic plaque components is used in experimental studies and in imaging studies to identify rupture-prone plaques. However, we lack the evidence that this balance predicts future cardiovascular events. Here we explore whether a calculated histological ratio, referred to as vulnerability index (VI), can predict patients at higher risk to suffer from future cardiovascular events. Methods and Results: Carotid plaques and clinical information from 194 patients were studied. Tissue sections were used for histological analysis to calculate the VI (CD68 [cluster of differentiation 68], alpha-actin, Oil red O, Movat pentachrome, and glycophorin A). Postoperative cardiovascular events were identified through the Swedish National Inpatient Health Register (2005–2013). During the follow-up (60 months) 45 postoperative cardiovascular events were registered. Patients with a plaque VI in the fourth quartile compared with the first to third quartiles had significantly higher risk to suffer from a future cardiovascular event (P = 0.0002). The VI was an independent predictor and none of the 5 histological variables analyzed separately predicted events. In the 13 patients who underwent bilateral carotid endarterectomy, the VI of the right plaque correlated with the VI of the left plaque and vice versa (r = 0.7, P = 0.01). Conclusions: Our findings demonstrate that subjects with a high plaque VI have an increased risk of future cardiovascular events, independently of symptoms and other known cardiovascular risk factors. This strongly supports that techniques which image such plaques can facilitate risk stratification for subjects in need of more intense treatment.
