Riccardo Doyle scite author profile

Riccardo Doyle

5 Publications

10 Citation Statements Received

35 Citation Statements Given


Affiliations

Imperial College London, St. James's Hospital

Publications


Machine Learning–Based Prediction of COVID-19 Mortality With Limited Attributes to Expedite Patient Prognosis and Triage: Retrospective Observational Study

Doyle

2021

JMIRx Med


Background: The onset and development of the COVID-19 pandemic have placed pressure on hospital resources and staff worldwide. The integration of more streamlined predictive modeling in prognosis and triage–related decision-making can partly ease this pressure.

Objective: The objective of this study is to assess the performance impact of dimensionality reduction on COVID-19 mortality prediction models, demonstrating that a limited number of features carries most of the predictive signal and so reduces the need for complex variable gathering before reaching meaningful risk labelling in clinical settings.

Methods: Standard machine learning classifiers were employed to predict an outcome of either death or recovery using 25 patient-level variables, spanning symptoms, comorbidities, and demographic information, from a geographically diverse sample representing 17 countries. The effects of feature reduction were tested by running classifiers on a high-quality data set of 212 patients with populated entries for all 25 available features. The full data set was compared to two reduced variations with 7 features and 1 feature, respectively, extracted using univariate mutual information and chi-square testing. Classifier performance on each data set was then assessed on the basis of accuracy, sensitivity, specificity, and receiver operating characteristic–derived area under the curve metrics to quantify benefit or loss from reduction.

Results: The classifiers achieved strong mortality detection on the 212-patient sample, with the highest performing model achieving specificity of 90.7% (95% CI 89.1%-92.3%) and sensitivity of 92.0% (95% CI 91.0%-92.9%). Dimensionality reduction provided strong benefits for performance. The baseline accuracy of a random forest classifier increased from 89.2% (95% CI 88.0%-90.4%) to 92.5% (95% CI 91.9%-93.0%) when training on 7 chi-square–extracted features and to 90.8% (95% CI 89.8%-91.7%) when training on 7 mutual information–extracted features. Reduction impact on a separate logistic classifier was mixed; however, when present, losses were marginal compared to the extent of feature reduction, altogether showing that reduction either improves performance or can reduce the variable-sourcing burden at hospital admission with little performance loss. Extreme feature reduction to a single most salient feature, often age, demonstrated large standalone explanatory power, with the best-performing model achieving an accuracy of 81.6% (95% CI 81.1%-82.1%); this demonstrates the relatively marginal improvement that additional variables bring to the tested models.

Conclusions: Predictive statistical models have promising performance in early prediction of death among patients with COVID-19. Strong dimensionality reduction was shown to further improve baseline performance on selected classifiers and only marginally reduce it in others, highlighting the importance of feature reduction in future model construction and the feasibility of deprioritizing large, hard-to-source, and nonessential feature sets in real-world settings.
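The univariate feature ranking described in the abstract can be illustrated with a minimal sketch. It assumes binary features and a binary outcome, and it uses the classical Pearson chi-square statistic and discrete mutual information computed from contingency counts; the study's actual implementation, library, and data are not given here, so the feature names, the toy values, and the helper functions `chi_square`, `mutual_information`, and `top_k` are all hypothetical.

```python
import math
from collections import Counter


def chi_square(xs, ys):
    """Pearson chi-square statistic between a discrete feature and the outcome."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_tot, y_tot = Counter(xs), Counter(ys)
    stat = 0.0
    for x in x_tot:
        for y in y_tot:
            expected = x_tot[x] * y_tot[y] / n
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_information(xs, ys):
    """Mutual information (in nats) between a discrete feature and the outcome."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_tot, y_tot = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        mi += p_xy * math.log(p_xy * n * n / (x_tot[x] * y_tot[y]))
    return mi


def top_k(features, outcome, score, k):
    """Return the names of the k features with the highest univariate score."""
    ranked = sorted(features, key=lambda name: score(features[name], outcome), reverse=True)
    return ranked[:k]


# Hypothetical toy data: 8 patients, outcome 1 = death, 0 = recovery.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "age_over_65": [1, 1, 1, 1, 0, 0, 0, 1],
    "fever": [1, 0, 1, 0, 1, 0, 1, 0],
    "diabetes": [1, 1, 1, 0, 0, 0, 1, 0],
}

selected_chi2 = top_k(features, outcome, chi_square, k=2)
selected_mi = top_k(features, outcome, mutual_information, k=2)
print(selected_chi2, selected_mi)
```

In this toy sample both scores rank the age indicator first and leave out the feature that is independent of the outcome. A classifier would then be trained only on the selected columns, which is the reduction step the abstract evaluates.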


Prediction of COVID-19 Mortality to Support Patient Prognosis and Triage and Limits of Current Open-Source Data

Doyle

2021

Preprint


This study examines the accuracy and applicability of machine learning methods in early prediction of mortality in COVID-19 patients. Patient symptoms, pre-existing conditions, age and sex were employed as predictive attributes from data spanning 17 countries. Performance on a roughly class-balanced sample of 212 patients resulted in high detection accuracy of 92.5%, with strong specificity and sensitivity. Performance on a larger sample of 5,121 patients with only age and mortality information was added as a measure of baseline discriminatory ability. Stratifying (Random Forest) and linear (Logistic Regression) methods were applied, both achieving modestly strong performance, with 77.4%-79.3% sensitivity and 71.4%-72.6% accuracy, highlighting predictive power even on the basis of a single attribute. Mutual information was employed as a dimensionality reduction technique, either greatly improving performance or having negligible impact, showing how a small number of easily retrievable attributes can provide timely and accurate predictions, with applications for datasets whose attributes are slow to become available, such as laboratory results. Limitations of the data were extensively explored and detailed, with each results section followed by a further investigation of one facet of the data's flaws. Future use of this dataset should be cautious and always accompanied by disclaimers on issues of real-life reproducibility. While its open-source nature is a credit to the wider research community and more such datasets should be published, in its current state it is imperfect for most statistical patient-level studies and can produce valid conclusions only for a limited set of applications.
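To make the reported metrics concrete, the sketch below computes accuracy, sensitivity, and specificity for a prediction made from age alone. It is only an illustration: a fixed age cutoff stands in for the study's Random Forest and Logistic Regression models, and the ages, outcomes, the cutoff of 65, and the `metrics` helper are hypothetical values chosen for the example, not data from the study.

```python
def metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of deaths correctly flagged
        "specificity": tn / (tn + fp),  # share of recoveries correctly cleared
    }


# Hypothetical patients: age in years and observed outcome (1 = death).
ages = [34, 45, 52, 61, 67, 70, 78, 83, 58, 88]
died = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# Single-attribute rule (an assumption for illustration): predict death at age 65 or above.
AGE_CUTOFF = 65
predicted = [1 if age >= AGE_CUTOFF else 0 for age in ages]

result = metrics(died, predicted)
print(result)
```

Sensitivity and specificity are reported alongside accuracy because a triage model that misses deaths and one that over-flags recoveries can have the same accuracy while carrying very different clinical costs.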


Author’s Response to Peer Reviews of “Machine Learning–Based Prediction of COVID-19 Mortality With Limited Attributes to Expedite Patient Prognosis and Triage: Retrospective Observational Study” (Preprint)

Doyle

2021

Preprint


Parallel Conformal Hyperparameter Optimization

Doyle

2022

Preprint


Machine Learning–Based Prediction of COVID-19 Mortality With Limited Attributes to Expedite Patient Prognosis and Triage: Retrospective Observational Study (Preprint)

Doyle

2021

Preprint




