An Interpretable Prediction Model for Identifying N7-Methylguanosine Sites Based on XGBoost and SHAP

Bi, Yue; Xiang, Dongxu; Ge, Zongyuan; Li, Fuyi; Jia, Cangzhi; Song, Jiangning

doi:10.1016/j.omtn.2020.08.022

Cited by 105 publications

(68 citation statements)

References 36 publications

“…It uses a sparsity-aware learning algorithm to process sparse data and weighted quantile sketch to approximate tree learning [41]. Since the decision tree is a simple classifier composed of hierarchically organized dichotomous determinations, its structure also demonstrates good interpretability [48][49][50]. In addition, the model can deal with missing values well.…”

Section: Prediction Model (mentioning)

confidence: 99%

Predicting Antituberculosis Drug–Induced Liver Injury Using an Interpretable Machine Learning Method: Model Development and Validation Study

Zhong, Zhuang, Dong et al. 2021

JMIR Med Inform

Background: Tuberculosis (TB) is a pandemic, being one of the top 10 causes of death and the main cause of death from a single source of infection. Drug-induced liver injury (DILI) is the most common and serious side effect during the treatment of TB.

Objective: We aim to predict the status of liver injury in patients with TB at the clinical treatment stage.

Methods: We designed an interpretable prediction model based on the XGBoost algorithm and identified the most robust and meaningful predictors of the risk of TB-DILI on the basis of clinical data extracted from the Hospital Information System of Shenzhen Nanshan Center for Chronic Disease Control from 2014 to 2019.

Results: In total, 757 patients were included, and 287 (38%) had developed TB-DILI. Based on values of relative importance and area under the receiver operating characteristic curve, machine learning tools selected patients’ most recent alanine transaminase levels, average rate of change of patients’ last 2 measures of alanine transaminase levels, cumulative dose of pyrazinamide, and cumulative dose of ethambutol as the best predictors for assessing the risk of TB-DILI. In the validation data set, the model had a precision of 90%, recall of 74%, classification accuracy of 76%, and balanced error rate of 77% in predicting cases of TB-DILI. The area under the receiver operating characteristic curve score upon 10-fold cross-validation was 0.912 (95% CI 0.890-0.935). In addition, the model provided warnings of high risk for patients in advance of DILI onset for a median of 15 (IQR 7.3-27.5) days.

Conclusions: Our model shows high accuracy and interpretability in predicting cases of TB-DILI, which can provide useful information to clinicians to adjust the medication regimen and avoid more serious liver injury in patients.
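The abstract above reports precision, recall, accuracy, and balanced error rate. As a rough illustration of how such figures are commonly derived from a binary confusion matrix, here is a minimal Python sketch. The function name and the example counts are hypothetical, and the balanced error rate uses one common definition (one minus the mean of sensitivity and specificity); the abstract does not state which definition its authors used.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    def safe_div(num, den):
        # Return 0.0 rather than raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)      # share of predicted positives that are correct
    recall = safe_div(tp, tp + fn)         # sensitivity: share of true positives found
    specificity = safe_div(tn, tn + fp)    # share of true negatives found
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    # Balanced error rate under the assumed definition: 1 - mean(sensitivity, specificity).
    balanced_error_rate = 1.0 - (recall + specificity) / 2.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_error_rate": balanced_error_rate,
    }


# Made-up counts, chosen only to keep the arithmetic easy to follow.
example = classification_metrics(tp=40, fp=10, tn=40, fn=10)
```

With these invented counts every rate works out to 0.8 and the balanced error rate to about 0.2; the numbers are not related to the study's data.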

“…It is a powerful tool to peer inside black box models and understand how they arrive at a particular decision. Multiple recent studies used only SHAP values for variable selection [35–39], however, only the variables with the greatest impact as defined by average absolute SHAP value were chosen. This is in contrast to this study, in which it was shown that variables with the greatest contribution are not necessarily robust (Fig.…”

Section: Discussion (mentioning)

confidence: 99%
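The statement above refers to selecting variables by average absolute SHAP value. A minimal sketch of that ranking step follows, assuming the per-sample SHAP values have already been computed by some explainer and are available as a plain list of rows. The function name, feature names, and numbers are hypothetical and only illustrate the ranking itself, not any cited study's pipeline.

```python
def rank_by_mean_abs_shap(shap_values, feature_names, top_k=None):
    """Rank features by mean absolute SHAP value, largest first.

    shap_values: list of rows, one per sample, each with one value per feature
                 (assumed precomputed elsewhere).
    feature_names: names matching the columns of shap_values.
    top_k: optionally keep only the k highest-ranked features.
    """
    if not shap_values:
        raise ValueError("shap_values must contain at least one row")
    if any(len(row) != len(feature_names) for row in shap_values):
        raise ValueError("each row needs one SHAP value per feature")

    n_samples = len(shap_values)
    scores = [
        (name, sum(abs(row[j]) for row in shap_values) / n_samples)
        for j, name in enumerate(feature_names)
    ]
    # Sort by score, descending; ties keep their original column order.
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Invented SHAP values for two samples and three hypothetical features.
toy_shap = [
    [0.5, -0.1, 0.0],
    [-0.3, 0.2, 0.1],
]
ranking = rank_by_mean_abs_shap(toy_shap, ["feat_a", "feat_b", "feat_c"])
top_two = rank_by_mean_abs_shap(toy_shap, ["feat_a", "feat_b", "feat_c"], top_k=2)
```

As the quoted statement notes, a high rank under this measure reflects magnitude of contribution only and says nothing about whether the feature is robust.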

A hierarchical expert-guided machine learning framework for clinical decision support systems: an application to traumatic brain injury prognostication

et al. 2021


Prognosis of the long-term functional outcome of traumatic brain injury is essential for personalized management of that injury. Nonetheless, accurate prediction remains unavailable. Although machine learning has shown promise in many fields, including medical diagnosis and prognosis, such models are rarely deployed in real-world settings due to a lack of transparency and trustworthiness. To address these drawbacks, we propose a machine learning-based framework that is explainable and aligns with clinical domain knowledge. To build such a framework, additional layers of statistical inference and human expert validation are added to the model, which ensures the predicted risk score’s trustworthiness. Using 831 patients with moderate or severe traumatic brain injury to build a model using the proposed framework, an area under the receiver operating characteristic curve (AUC) and accuracy of 0.8085 and 0.7488 were achieved, respectively, in determining which patients will experience poor functional outcomes. The performance of the machine learning classifier is not adversely affected by the imposition of statistical and domain knowledge “checks and balances”. Finally, through a case study, we demonstrate how the decision made by a model might be biased if it is not audited carefully.


“…In the application of biomedicine, Bi et al (2020) [23] developed a new interpretive machine learning approach using the XGBoost algorithm and six different types of sequential encoding schemes to distinguish m7G sites, with cross-validation showing that their approach was more accurate than other models. Mahmud et al (2019) [24] validated the reliability and superiority of the XGBoost classifier for the determination of drug-target interactions (DTI).…”

Section: Extreme Gradient Boosting (Xgboost) Algorithm Applications (mentioning)

confidence: 99%

An Interpretable Aid Decision-Making Model for Flag State Control Ship Detention Based on SMOTE and XGBoost

Hao, Wang 2021

JMSE

The reasonable decision of ship detention plays a vital role in flag state control (FSC). Machine learning algorithms can be applied as aid tools for identifying ship detention. In this study, we propose a novel interpretable ship detention decision-making model based on machine learning, termed SMOTE-XGBoost-Ship detention model (SMO-XGB-SD), using the extreme gradient boosting (XGBoost) algorithm and the synthetic minority oversampling technique (SMOTE) algorithm to identify whether a ship should be detained. Our verification results show that the SMO-XGB-SD algorithm outperforms random forest (RF), support vector machine (SVM), and logistic regression (LR) algorithm. In addition, the new algorithm also provides a reasonable interpretation of model performance and highlights the most important features for identifying ship detention using the Shapley additive explanations (SHAP) algorithm. The SMO-XGB-SD model provides an effective basis for aiding decisions on ship detention by inland flag state control officers (FSCOs) and the ship safety management of ship operating companies, as well as training services for new FSCOs in maritime organizations.
