Background Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning alongside a practical guide to developing and evaluating predictive algorithms using freely available open-source software and public domain data. Methods We demonstrate the use of machine learning techniques by developing three predictive models for cancer diagnosis using descriptions of nuclei sampled from breast masses. These algorithms include regularized General Linear Model regression (GLMs), Support Vector Machines (SVMs) with a radial basis function kernel, and single-layer Artificial Neural Networks. The publicly available dataset describing the breast mass samples (N=683) was randomly split into evaluation (n=456) and validation (n=227) samples. We trained algorithms on data from the evaluation sample before they were used to predict the diagnostic outcome in the validation dataset. We compared the predictions made on the validation dataset with the real-world diagnostic decisions to calculate the accuracy, sensitivity, and specificity of the three models. We explored the use of averaging and voting ensembles to improve predictive performance. We provide a step-by-step guide to developing algorithms using the open-source R statistical programming environment. Results The trained algorithms were able to classify cell nuclei with high accuracy (.94–.96), sensitivity (.97–.99), and specificity (.85–.94). Maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm. 
Prediction performance increased marginally (accuracy = .97, sensitivity = .99, specificity = .95) when algorithms were arranged into a voting ensemble. Conclusions We use a straightforward example to demonstrate the theory and practice of machine learning for clinicians and medical researchers. The principles that we demonstrate here can be readily applied to other complex tasks including natural language processing and image recognition. Electronic supplementary material The online version of this article (10.1186/s12874-019-0681-4) contains supplementary material, which is available to authorized users.
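As an illustration of the evaluation step described in this abstract, the following minimal Python sketch computes accuracy, sensitivity, and specificity from predicted and observed diagnoses, and combines three classifiers by majority vote. It is not the authors' code (the article provides its guide in R). The function names, the toy labels, and the choice of "malignant" as the positive class are assumptions made only for this example.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="malignant"):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive="malignant"):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def majority_vote(*model_predictions):
    """Voting ensemble: each case gets the label most models predicted.

    With an odd number of models and two classes there are no ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]


# Hypothetical toy data, NOT taken from the breast mass dataset.
M, B = "malignant", "benign"
y_true = [M, M, M, B, B, B, B, M]
glm = [M, B, M, B, B, M, B, M]  # assumed predictions of model 1
svm = [M, M, M, B, M, B, B, M]  # assumed predictions of model 2
ann = [B, M, M, B, B, B, B, M]  # assumed predictions of model 3

ensemble = majority_vote(glm, svm, ann)
glm_metrics = classification_metrics(y_true, glm)
ensemble_metrics = classification_metrics(y_true, ensemble)
```

In this toy case each model makes different errors, so the vote corrects them. That is the general reason a voting ensemble can outperform its members, although the gain reported in the abstract is marginal.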
Background The importance of patient-reported outcome measurement in chronic kidney disease (CKD) populations has been established. However, there remains a lack of research that has synthesised data around CKD-specific symptom and health-related quality of life (HRQOL) burden globally, to inform focused measurement of the most relevant patient-important information in a way that minimises patient burden. The aim of this review was to synthesise symptom prevalence/severity and HRQOL data across the following CKD clinical groups globally: (1) stage 1–5 and not on renal replacement therapy (RRT), (2) receiving dialysis, or (3) in receipt of a kidney transplant. Methods and findings MEDLINE, PsycINFO, and CINAHL were searched for English-language cross-sectional/longitudinal studies reporting prevalence and/or severity of symptoms and/or HRQOL in CKD, published between January 2000 and September 2021, including adult patients with CKD, and measuring symptom prevalence/severity and/or HRQOL using a patient-reported outcome measure (PROM). Random effects meta-analyses were used to pool data, stratified by CKD group: not on RRT, receiving dialysis, or in receipt of a kidney transplant. Methodological quality of included studies was assessed using the Joanna Briggs Institute Critical Appraisal Checklist for Studies Reporting Prevalence Data, and an exploration of publication bias performed. The search identified 1,529 studies, of which 449, with 199,147 participants from 62 countries, were included in the analysis. Studies used 67 different symptom and HRQOL outcome measures, which provided data on 68 reported symptoms. Random effects meta-analyses highlighted the considerable symptom and HRQOL burden associated with CKD, with fatigue particularly prevalent, both in patients not on RRT (14 studies, 4,139 participants: 70%, 95% CI 60%–79%) and those receiving dialysis (21 studies, 2,943 participants: 70%, 95% CI 64%–76%). 
A number of symptoms were significantly (p < 0.05 after adjustment for multiple testing) less prevalent and/or less severe within the post-transplantation population, which may suggest attribution to CKD (fatigue, depression, itching, poor mobility, poor sleep, and dry mouth). Quality of life was commonly lower in patients on dialysis (36-Item Short Form Health Survey [SF-36] Mental Component Summary [MCS] 45.7 [95% CI 45.5–45.8]; SF-36 Physical Component Summary [PCS] 35.5 [95% CI 35.3–35.6]; 91 studies, 32,105 participants for MCS and PCS) than in other CKD populations (patients not on RRT: SF-36 MCS 66.6 [95% CI 66.5–66.6], p = 0.002; PCS 66.3 [95% CI 66.2–66.4], p = 0.002; 39 studies, 24,600 participants; transplant: MCS 50.0 [95% CI 49.9–50.1], p = 0.002; PCS 48.0 [95% CI 47.9–48.1], p = 0.002; 39 studies, 9,664 participants). Limitations of the analysis are the relatively few studies contributing to symptom severity estimates and inconsistent use of PROMs (different measures and time points) across the included literature, which hindered interpretation. Conclusions The main findings highlight the considerable symptom and HRQOL burden associated with CKD. The synthesis provides a detailed overview of the symptom/HRQOL profile across clinical groups, which may support healthcare professionals when discussing, measuring, and managing the potential treatment burden associated with CKD. Protocol registration PROSPERO CRD42020164737.
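To make the pooling step concrete, here is a minimal Python sketch of a random effects meta-analysis of prevalence. The abstract does not state which estimator or transformation the review used, so this sketch assumes the DerSimonian-Laird estimator applied to logit-transformed proportions. The function name and the study counts are hypothetical and are not data from the review.

```python
import math


def pool_prevalence(studies, z=1.96):
    """Pool proportions with a DerSimonian-Laird random effects model.

    studies: list of (events, n) pairs with 0 < events < n.
    Returns (pooled proportion, CI lower, CI upper, tau squared).
    """
    effects, variances = [], []
    for events, n in studies:
        p = events / n
        effects.append(math.log(p / (1 - p)))              # logit of the proportion
        variances.append(1 / events + 1 / (n - events))    # variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance, truncated at zero.
    df = len(studies) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau squared to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2


# Hypothetical fatigue counts (events, sample size) for three studies.
example_studies = [(60, 100), (140, 200), (80, 100)]
pooled, ci_low, ci_high, tau2 = pool_prevalence(example_studies)
```

Stratifying by clinical group, as the review describes, would amount to calling the function once per group (not on RRT, dialysis, transplant) on that group's studies.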