Yan Li scite author profile

Objective. To assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions. Design. Longitudinal cohort study from 1 January 1998 to 31 December 2018. Setting and participants. 3.6 million patients from the Clinical Practice Research Datalink registered at 391 general practices in England with linked hospital admission and mortality records. Main outcome measures. Model performance including discrimination, calibration, and consistency of individual risk prediction for the same patients among models with comparable model performance. 19 different prediction techniques were applied, including 12 families of machine learning models (grid searched for best models), three Cox proportional hazards models (local fitted, QRISK3, and Framingham), three parametric survival models, and one logistic model. Results. The various models had similar population level performance (C statistics of about 0.87 and similar calibration). However, the predictions for individual risks of cardiovascular disease varied widely between and within different types of machine learning and statistical models, especially in patients with higher risks. A patient with a risk of 9.5-10.5% predicted by QRISK3 had a risk of 2.9-9.2% in a random forest and 2.4-7.2% in a neural network. The differences in predicted risks between QRISK3 and a neural network ranged between –23.2% and 0.1% (95% range). Models that ignored censoring (that is, assumed censored patients to be event free) substantially underestimated risk of cardiovascular disease. Of the 223 815 patients with a cardiovascular disease risk above 7.5% with QRISK3, 57.8% would be reclassified below 7.5% when using another model. Conclusions. A variety of models predicted risks for the same patients very differently despite similar model performances. The logistic models and commonly used machine learning models should not be directly applied to the prediction of long term risks without considering censoring. Survival models that consider censoring and that are explainable, such as QRISK3, are preferable. The level of consistency within and between models should be routinely assessed before they are used for clinical decision making.
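
The censoring point in this abstract can be illustrated with a small sketch. It is not the study's code: the records, the 10-year horizon, and the function names are hypothetical, and a plain Kaplan-Meier estimator stands in for the survival models the abstract names. It compares a naive risk, which counts censored patients as event free, with a risk that removes censored patients from the risk set once they are lost to follow-up.

```python
# Hypothetical follow-up records: (years observed, 1 = CVD event, 0 = censored).
RECORDS = [
    (2, 1), (3, 0), (4, 0), (5, 1), (6, 0),
    (7, 0), (8, 1), (10, 0), (10, 0), (10, 0),
]


def naive_risk(records, horizon):
    """Share of patients with an observed event by `horizon`.

    Censored patients are counted as event free, which is the
    assumption the abstract warns against.
    """
    events = sum(1 for time, event in records if event and time <= horizon)
    return events / len(records)


def kaplan_meier_risk(records, horizon):
    """1 minus the Kaplan-Meier survival estimate at `horizon`."""
    survival = 1.0
    event_times = sorted({t for t, event in records if event and t <= horizon})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for time, _ in records if time >= t)
        events = sum(1 for time, event in records if event and time == t)
        survival *= 1.0 - events / at_risk
    return 1.0 - survival


if __name__ == "__main__":
    print(f"naive risk:        {naive_risk(RECORDS, 10):.3f}")
    print(f"Kaplan-Meier risk: {kaplan_meier_risk(RECORDS, 10):.3f}")
```

On these made-up records the naive estimate is lower than the Kaplan-Meier estimate, because patients censored early shrink the risk set without adding events. The size of the gap depends entirely on the toy data and says nothing about the magnitudes reported in the study.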


Assessment of toxic interactions of heavy metals in multi-component mixtures using sea urchin embryo-larval bioassay

Xue, Wang, et al. 2011. Toxicology in Vitro

Large-scale Direct Targeting for Drug Repositioning and Discovery

Zheng, Guo, Huang, et al. 2015. Sci Rep

A system-level identification of drug-target direct interactions is vital to drug repositioning and discovery. However, identifying these interactions experimentally on a large scale remains challenging and expensive. The available computational models mainly focus on predicting indirect interactions or direct interactions on a small scale. To address these problems, in this work, a novel algorithm termed weighted ensemble similarity (WES) has been developed to identify drug direct targets based on a large-scale set of 98,327 drug-target relationships. WES includes: (1) identifying the key ligand structural features that are highly related to the pharmacological properties in an ensemble framework; (2) determining a drug’s affiliation with a target by evaluating the overall (ensemble) similarity rather than a single ligand judgment; and (3) integrating the standardized ensemble similarities (Z score) by Bayesian network and multi-variate kernel approach to make predictions. Together, these steps allow WES to predict drug direct targets with external and experimental test accuracies of 70% and 71%, respectively. This shows that the WES method provides a potential in silico model for drug repositioning and discovery.


A Systems Biology Approach to Uncovering Pharmacological Synergy in Herbal Medicines with Applications to Cardiovascular Disease

Wang, Xue, Tao, et al. 2012. Evidence-Based Complementary and Alternative Medicine

Background. Clinical trials reveal that multiherb prescriptions of herbal medicine often exhibit pharmacological and therapeutic superiority in comparison to isolated single constituents. However, the synergistic mechanisms underlying this remain elusive. To address this question, a novel systems biology model integrating oral bioavailability and drug-likeness screening, target identification, and network pharmacology method has been constructed and applied to four clinically widely used herbs Radix Astragali Mongolici, Radix Puerariae Lobatae, Radix Ophiopogonis Japonici, and Radix Salviae Miltiorrhiza which exert synergistic effects of combined treatment of cardiovascular disease (CVD). Results. The results show that the structural properties of molecules in four herbs have substantial differences, and each herb can interact with significant target proteins related to CVD. Moreover, the bioactive ingredients from different herbs potentially act on the same molecular target (multiple-drug-one-target) and/or the functionally diverse targets but with potentially clinically relevant associations (multiple-drug-multiple-target-one-disease). From a molecular/systematic level, this explains why the herbs within a concoction could mutually enhance pharmacological synergy on a disease. Conclusions. The present work provides a new strategy not only for the understanding of pharmacological synergy in herbal medicine, but also for the rational discovery of potent drug/herb combinations that are individually subtherapeutic.


Insights on Structural Characteristics and Ligand Binding Mechanisms of CDK2

Zhang, Gao, et al. 2015. IJMS

Cyclin-dependent kinase 2 (CDK2) is a crucial regulator of the eukaryotic cell cycle. However, it is well established that monomeric CDK2 lacks regulatory activity; activation requires binding of its positive regulators, cyclins E and A, or phosphorylation on the catalytic segment. Interestingly, these activation steps bring dynamic changes to the 3D structure of the kinase, especially the activation segment. Until now, in the monomeric CDK2 structure, three binding sites have been reported, including the adenosine triphosphate (ATP) binding site (Site I) and two non-competitive binding sites (Site II and III). In addition, when the kinase is subjected to the cyclin binding process, the resulting structural changes give rise to a variation of the ATP binding site, thus generating an allosteric binding site (Site IV). All four sites have been shown to be targeted by corresponding inhibitors, as illustrated by the allosteric binding site, which is targeted by the inhibitor ANS (fluorophore 8-anilino-1-naphthalene sulfonate). The present work focuses on the binding mechanisms and how they change during the activation process. We therefore review the structural characterization of CDK2, which is expected to facilitate the understanding of the molecular mechanisms of kinase proteins. In addition, the binding mechanisms of CDK2 with its relevant inhibitors, as well as the changes in binding mechanisms following conformational variations of CDK2, are summarized and compared. The summary of the conformational characteristics and ligand binding mechanisms of CDK2 in the present work will improve our understanding of the molecular mechanisms regulating the bioactivities of CDK2.


Do population-level risk prediction models that use routinely collected health data reliably predict individual risks?

Sperrin, Belmonte, et al. 2019. Sci Rep

The objective of this study was to assess the reliability of individual risk predictions based on routinely collected data considering the heterogeneity between clinical sites in data and populations. Cardiovascular disease (CVD) risk prediction with QRISK3 was used as exemplar. The study included 3.6 million patients in 392 sites from the Clinical Practice Research Datalink. Cox models with QRISK3 predictors and a frailty (random effect) term for each site were used to incorporate unmeasured site variability. There was considerable variation in data recording between general practices (missingness of body mass index ranged from 18.7% to 60.1%). Incidence rates varied considerably between practices (from 0.4 to 1.3 CVD events per 100 patient-years). Individual CVD risk predictions with the random effect model were inconsistent with the QRISK3 predictions. For patients with a QRISK3 predicted risk of 10%, the 95% range of predicted risks with the random effects model was 7.2% to 13.7%. Random variability only explained a small part of this. The random effects model was equivalent to QRISK3 for discrimination and calibration. Risk prediction models based on routinely collected health data perform well for populations but with great uncertainty for individuals. Clinicians and patients need to understand this uncertainty.


An in silico approach for screening flavonoids as P-glycoprotein inhibitors based on a Bayesian-regularized neural network

Wang, Yang, et al. 2005. J Comput Aided Mol Des

P-glycoprotein (P-gp), an ATP-binding cassette (ABC) transporter, functions as a biological barrier by extruding cytotoxic agents out of cells, resulting in an obstacle in chemotherapeutic treatment of cancer. In order to aid in the development of potential P-gp inhibitors, we constructed a quantitative structure-activity relationship (QSAR) model of flavonoids as P-gp inhibitors based on a Bayesian-regularized neural network (BRNN). A dataset of 57 flavonoids binding to the C-terminal nucleotide-binding domain of mouse P-gp was compiled from the literature. The predictive ability of the model was assessed using a test set that was independent of the training set, which showed a standard error of prediction of 0.146 ± 0.006 (data scaled from 0 to 1). Meanwhile, two other mathematical tools, back-propagation neural network (BPNN) and partial least squares (PLS), were also used to build QSAR models. The BRNN provided slightly better results for the test set compared to BPNN, but the difference was not significant according to the F-statistic at p=0.05. The PLS failed to build a reliable model in the present study. Our study indicates that the BRNN-based in silico model has good potential in facilitating the prediction of P-gp flavonoid inhibitors and might be applied in further drug design.


MiR-206-mediated dynamic mechanism of the mammalian circadian clock

Wei, Wang, et al. 2011. BMC Syst Biol

Background. As a group of highly conserved small non-coding RNAs with a length of 21 to 23 nucleotides, microRNAs (miRNAs) regulate gene expression post-transcriptionally by base pairing with the partial or full complementary sequences in target mRNAs, thus resulting in the repression of mRNA translation and the acceleration of mRNA degradation. Recent work has revealed that miRNAs are essential for the development and functioning of the skeletal muscles where they are. In particular, miR-206 has not only been identified as the only miRNA expressed in skeletal muscles, but also exhibited crucial roles in regulation of the muscle development. Although miRNAs are known to regulate various biological processes ranging from development to cancer, much less is known about their role in the dynamic regulation of the mammalian circadian clock. Results. A detailed dynamic model of the miR-206-mediated mammalian circadian clock system was developed using Hill-type terms, Michaelis-Menten type kinetics, and mass action kinetics. Based on a system-theoretic approach, the model accurately predicts both the periodicity and the entrainment of the circadian clock. It also explores the dynamic properties of the oscillations mediated by miR-206 by means of sensitivity analysis and alterations of parameters. Our results show that miR-206 is an important regulator of the circadian clock in skeletal muscle, and thus by study of miR-206 the main features of its mediation on the clock may be captured. Simulations of these processes show that the amplitude and frequency of the oscillation can be significantly altered through the miR-206-mediated control. Conclusions. MiR-206 has a profound effect on the dynamic mechanism of the mammalian circadian clock, both by control of the amplitude and control or alteration of the frequency to affect the level of the gene expression and to interfere with the temporal sequence of the gene production or delivery. This undoubtedly uncovers a new mechanism for regulation of the circadian clock at a post-transcriptional level and provides important insights into the normal development as well as the pathological conditions of skeletal muscles, such as aging, chronic disease, and cancer.
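
For readers unfamiliar with the three rate laws named in this abstract, the sketch below gives their generic textbook forms. These are not the paper's equations: the function names, parameter names, and the choice to show both an activating and a repressing Hill term are assumptions made for illustration only.

```python
def hill_activation(x, v_max, k, n):
    """Hill-type activation: rate rises sigmoidally with regulator level x."""
    return v_max * x**n / (k**n + x**n)


def hill_repression(x, v_max, k, n):
    """Hill-type repression: rate falls as regulator level x increases."""
    return v_max * k**n / (k**n + x**n)


def michaelis_menten(s, v_max, k_m):
    """Michaelis-Menten rate: saturating in substrate concentration s."""
    return v_max * s / (k_m + s)


def mass_action(rate_constant, *concentrations):
    """Mass action rate: rate constant times the product of concentrations."""
    rate = rate_constant
    for concentration in concentrations:
        rate *= concentration
    return rate


if __name__ == "__main__":
    # At x == k the Hill terms sit at half of v_max, whatever the exponent n.
    print(hill_activation(2.0, v_max=1.0, k=2.0, n=4))
    print(hill_repression(2.0, v_max=1.0, k=2.0, n=4))
    # At s == k_m the Michaelis-Menten rate is half of v_max.
    print(michaelis_menten(3.0, v_max=2.0, k_m=3.0))
    print(mass_action(0.5, 2.0, 3.0))
```

In a dynamic model of this kind, terms like these would typically appear on the right-hand side of ordinary differential equations for mRNA and protein levels. Which species each term applies to in the miR-206 model is not stated in the abstract.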




Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: longitudinal cohort study using cardiovascular disease as exemplar
