More than ever, COVID-19 is putting pressure on health systems worldwide, especially in Brazil. In this study, we propose a method based on statistics and machine learning that uses patients' blood test data to predict whether they will require special care (hospitalization in regular or special-care units). We also predict the number of days patients will remain under such care. The two-step procedure uses Bayesian optimisation to select the best model among several candidates. The final models achieve an area under the ROC curve of 0.94 for the first target and a root mean squared error of 1.87 for the second target (a 77% improvement over the mean baseline), making our model ready to be deployed as a decision system that could be made available to anyone interested. The analytical approach can be applied to other diseases and can help plan hospital resources in other contexts.
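The two reported metrics can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and it assumes that the "improvement over the mean baseline" is the relative reduction in RMSE compared with always predicting the mean length of stay (here the mean of the same sample, for simplicity).

```python
import math

def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive case is scored above a random negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def improvement_over_mean_baseline(y_true, y_pred):
    """Relative RMSE reduction versus predicting the mean for every patient.
    Assumption: this is how the improvement figure is defined."""
    mean = sum(y_true) / len(y_true)
    baseline = rmse(y_true, [mean] * len(y_true))
    return 1.0 - rmse(y_true, y_pred) / baseline

# Hypothetical toy data, not taken from the study.
needs_care = [0, 0, 1, 1, 0, 1]            # first target: special care required?
care_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
days_true = [2, 4, 6, 8]                   # second target: days under care
days_pred = [3, 4, 5, 8]

print(roc_auc(needs_care, care_score))
print(rmse(days_true, days_pred))
print(improvement_over_mean_baseline(days_true, days_pred))
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used, and the baseline mean would be estimated on training data rather than on the evaluation sample.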
Blood tests play an essential part in everyday medicine and are used by doctors in several diagnostic procedures. However, these data are multivariate, and some diseases, such as COVID-19, can show different symptom manifestations and outcomes. This study proposes a method for extracting useful information from blood tests using Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP) combined with DBSCAN clustering and statistical approaches. The analysis identifies several clusters whose infection prevalence varies between 2% and 37%, indicating that the procedure is capable of finding distinct patterns. A possible explanation is that COVID-19 is not just a respiratory infection but a systemic disease with critical hematological implications, primarily on white-cell fractions, as indicated by p-values in the range 0.03–0.1 from the relevant statistical tests. The proposed analysis procedure could be applied to data sets for other illnesses to help researchers discover new patterns in various diseases and contexts.
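The per-cluster prevalence figures can be illustrated with a short sketch that covers only the final counting step. It assumes cluster labels have already been produced (for example by DBSCAN on a UMAP embedding) and that noise points carry the label -1, as in scikit-learn's DBSCAN. The function name and the toy data are hypothetical.

```python
from collections import defaultdict

def prevalence_by_cluster(labels, positive):
    """Return {cluster label: fraction of positive cases}, skipping noise (-1)."""
    counts = defaultdict(lambda: [0, 0])   # cluster -> [positives, total]
    for label, is_positive in zip(labels, positive):
        if label == -1:                    # noise point, belongs to no cluster
            continue
        counts[label][0] += int(is_positive)
        counts[label][1] += 1
    return {label: pos / total for label, (pos, total) in sorted(counts.items())}

# Hypothetical toy data, not taken from the study.
cluster_labels = [0, 0, 0, 0, 1, 1, 1, -1, 1, 0]
covid_positive = [0, 0, 0, 1, 1, 0, 1, 1, 0, 0]

print(prevalence_by_cluster(cluster_labels, covid_positive))
```

Clusters whose prevalence differs markedly from the overall rate are the ones worth examining with follow-up statistical tests on individual blood markers.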