Objectives
To systematically review studies that use machine learning (ML) algorithms to predict whether patients undergoing total knee arthroplasty (TKA) or total hip arthroplasty (THA) achieve an improvement at least as large as the minimal clinically important difference (MCID) in patient-reported outcome measures (PROMs), framed as a classification problem.
Methods
Studies were eligible for inclusion if they collected PROMs both before and after the intervention, reported the method of MCID calculation, and applied ML. ML was defined as a family of models that automatically learn from data when selecting features or identifying nonlinear relations or interactions. Predictive performance had to be assessed using common metrics. Studies were searched in MEDLINE, PubMed Central, Web of Science Core Collection, Google Scholar and Cochrane Library. Study selection and risk of bias (ROB) assessment were conducted by two independent researchers.
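The classification target described above can be illustrated with a minimal sketch. It assumes a single PROM, a fixed MCID threshold, and a hypothetical `PromRecord` structure; the scores and the threshold of 8.0 are illustrative only and are not taken from any reviewed study.

```python
from dataclasses import dataclass


@dataclass
class PromRecord:
    """One patient's PROM scores before and after surgery (hypothetical structure)."""
    pre_score: float
    post_score: float


def achieved_mcid(record: PromRecord, mcid: float, higher_is_better: bool = True) -> bool:
    """Return True if the pre-to-post improvement is at least as large as the MCID."""
    change = record.post_score - record.pre_score
    if not higher_is_better:
        change = -change  # on scales where lower scores mean better health
    return change >= mcid


# Illustrative values only (assumption): three patients and an MCID of 8 points.
patients = [PromRecord(40.0, 55.0), PromRecord(60.0, 63.0), PromRecord(30.0, 38.0)]
labels = [achieved_mcid(p, mcid=8.0) for p in patients]
print(labels)  # [True, False, True]
```

The resulting binary labels are what an ML classifier would be trained to predict from preoperative features.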
Results
A total of 517 studies were eligible for title and abstract screening, of which 18 qualified for full-text screening. Finally, six studies were included. The most commonly applied ML algorithms were random forest and gradient boosting. Overall, eleven different ML algorithms were applied across the included studies. All studies reported at least fair predictive performance, with two reporting excellent performance. Sample size varied widely across studies, from 587 to 34,110 individuals. PROMs also varied widely across studies, with sixteen applied to TKA and six applied to THA. No single PROM was used in all studies. All studies calculated MCIDs for PROMs using anchor-based or distribution-based methods or referred to literature that did so. Five studies reported variable importance for their models. Two studies were at high risk of bias.
Discussion
No ML model was identified as performing best on the stated problem, nor can any PROM be said to be the most predictable. Reporting standards must be improved to reduce risk of bias and improve comparability between studies.
Aim
To estimate the cost‐effectiveness of an intervention that facilitates the early detection of adverse drug events by means of health professional training and the application of a digital screening tool.
Design
Multi‐centred non‐randomized controlled trial from August 2018 to March 2020 including 65 nursing homes or home care providers.
Methods
We aim to estimate the effect of the intervention on the rate of adverse drug events as the primary outcome using a quasi‐experimental empirical study design. As secondary outcomes, we use hospital admissions and falls. All outcomes will be measured at the patient‐month level. Once the causal effect of the intervention is estimated, cost‐effectiveness will be calculated. For cost‐effectiveness, we include all patient costs observed by the German statutory health insurance.
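The protocol does not state which cost-effectiveness measure will be used. As one common possibility, the sketch below computes an incremental cost-effectiveness ratio (ICER) per adverse drug event avoided. The function name, the per-patient-month units and all numbers are assumptions chosen for illustration, not study data.

```python
def icer_per_event_avoided(cost_intervention: float, cost_control: float,
                           rate_intervention: float, rate_control: float) -> float:
    """Incremental cost per adverse drug event avoided (all inputs per patient-month)."""
    extra_cost = cost_intervention - cost_control
    events_avoided = rate_control - rate_intervention
    if events_avoided == 0:
        raise ValueError("ICER is undefined when the intervention avoids no events")
    return extra_cost / events_avoided


# Illustrative values only (assumption): costs in euros and event rates per patient-month.
ratio = icer_per_event_avoided(cost_intervention=520.0, cost_control=500.0,
                               rate_intervention=0.08, rate_control=0.10)
print(f"Incremental cost per event avoided: {ratio:.0f}")
```

In this sketch the rate difference stands in for the estimated causal effect; in the study it would come from the quasi-experimental analysis.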
Results
The results of this study will inform about the cost‐effectiveness of the optimized drug supply intervention and provide evidence for potential reimbursement within the German statutory health insurance system.
Our aim was to predict future high-cost patients with machine learning using healthcare claims data. We applied a random forest (RF), a gradient boosting machine (GBM), an artificial neural network (ANN) and a logistic regression (LR) to predict high-cost patients in the following year. To this end, we used routinely collected sickness fund claims and cost data from the years 2016, 2017 and 2018. Various specifications of each algorithm were trained and cross-validated on training data (n = 20,984) with claims and cost data from 2016 and outcomes from 2017. The best-performing specification of each algorithm was selected based on validation dataset performance. For performance comparison, the selected models were applied to unseen data with features from 2017 and outcomes from 2018 (n = 21,146). The RF was the best-performing algorithm as measured by the area under the receiver operating characteristic curve (AUC), with a value of 0.883 (95% confidence interval (CI): 0.872–0.893) on test data, followed by the GBM (AUC = 0.878; 95% CI: 0.867–0.889). The ANN (AUC = 0.846; 95% CI: 0.834–0.857) and LR (AUC = 0.839; 95% CI: 0.826–0.852) were significantly outperformed by the GBM and the RF. All ML algorithms and the LR performed ‘good’ (i.e. 0.9 > AUC ≥ 0.8). We were able to develop machine learning models that predict high-cost patients with ‘good’ performance using routinely collected sickness fund claims and cost data. We found that tree-based models performed best and outperformed the ANN and LR.
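To make the reported metric concrete, the following sketch computes an AUC and a confidence interval for it. The abstract does not say how the intervals were obtained; the percentile bootstrap used here is an assumption, and the function names and toy data are hypothetical.

```python
import random


def auc(y_true, y_score):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (assumption): 1 = high-cost patient in the following year, scores = predicted risk.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, y_score))  # 0.75
```

The pairwise loop is quadratic in sample size, so for test sets of the size reported above a rank-based implementation from an established library would be the practical choice.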