A Risk Prediction Model for Invasive Fungal Disease in Critically Ill Patients in the Intensive Care Unit

Li, Fangyi; Zhou, Mi; Zou, Zhong‐Mei; Li, Weichao; Huang, Canxia; He, Zhijie

doi:10.1016/j.anr.2018.11.004

Cited by 8 publications

(6 citation statements)

References 22 publications


“…These results may indicate that the differences in sampling strategies among countries and studies could explain a large portion of the observed variability in HAI prevalence. Therefore, the crude risk estimates extracted by prevalence studies should be built on larger samples and enriched by more sensitive analysis, like individual risk analysis based on patient characteristics [ 39 , 40 ] or multilevel models [ 41 – 43 ] to factor out the hospitals’ specific contributions to risk.…”

Section: Discussion (mentioning)

confidence: 99%

Data quality assessment and subsampling strategies to correct distributional bias in prevalence studies

D’Ambrosio, Garlasco, Quattrocolo et al. 2021, BMC Med Res Methodol


Background: Healthcare-associated infections (HAIs) represent a major Public Health issue. Hospital-based prevalence studies are a common tool of HAI surveillance, but data quality problems and non-representativeness can undermine their reliability. Methods: This study proposes three algorithms that, given a convenience sample and variables relevant for the outcome of the study, select a subsample with specific distributional characteristics, boosting either representativeness (Probability and Distance procedures) or risk factors’ balance (Uniformity procedure). A “Quality Score” (QS) was also developed to grade sampled units according to data completeness and reliability. The methodologies were evaluated through bootstrapping on a convenience sample of 135 hospitals collected during the 2016 Italian Point Prevalence Survey (PPS) on HAIs. Results: The QS highlighted wide variations in data quality among hospitals (median QS 52.9 points, range 7.98–628, lower meaning better quality), with most problems ascribable to ward and hospital-related data reporting. Both Distance and Probability procedures produced subsamples with lower distributional bias (Log-likelihood score increased from 7.3 to 29 points). The Uniformity procedure increased the homogeneity of the sample characteristics (e.g., −58.4% in geographical variability). The procedures selected hospitals with higher data quality, especially the Probability procedure (lower QS in 100% of bootstrap simulations). The Distance procedure produced lower HAI prevalence estimates (6.98% compared to 7.44% in the convenience sample), more in line with the European median. Conclusions: The QS and the subsampling procedures proposed in this study could represent effective tools to improve the quality of prevalence studies, decreasing the biases that can arise due to non-probabilistic sample collection.

“…These results may indicate that the differences in sampling strategies among countries and studies could explain a large portion of the observed variability in HAI prevalence. Therefore, the crude risk estimates extracted by prevalence studies should be built on larger samples and enriched by more sensitive analysis, like individual risk analysis based on patient characteristics [37], [38] or multilevel models [39]-[41] to factor out the hospitals' specific contributions to risk.…”

Section: Discussion (mentioning)

confidence: 99%

Data Quality Assessment and Subsampling Strategies to Correct Distributional Bias in Prevalence Studies.

D’Ambrosio, Garlasco, Quattrocolo et al. 2020, Preprint


Background: Healthcare-associated infections (HAIs) represent a major Public Health issue. Hospital-based prevalence studies are a common tool of HAI surveillance, but data quality problems and non-representativeness can undermine their reliability. Methods: This study proposes three algorithms that, given a convenience sample and variables relevant for the outcome of the study, select a subsample with specific distributional characteristics, boosting either representativeness (Probability and Distance procedures) or risk factors' balance (Uniformity procedure). A "Quality Score" (QS) was also developed to grade sampled units according to data completeness and reliability. The methodologies were evaluated through bootstrapping on a convenience sample of 135 hospitals collected during the 2016 Italian Point Prevalence Survey (PPS) on HAIs. Results: The QS highlighted wide variations in data quality among hospitals (median QS 52.9 points, range 7.98-628, lower meaning better quality), with most problems ascribable to ward and hospital-related data reporting. Both Distance and Probability procedures produced subsamples with lower distributional bias (Log-likelihood score increased from 7.3 to 29 points). The Uniformity procedure increased the homogeneity of the sample characteristics (e.g., -58.4% in geographical variability). The procedures selected hospitals with higher data quality, especially the Probability procedure (lower QS in 100% of bootstrap simulations). The Distance procedure produced lower HAI prevalence estimates (6.98% compared to 7.44% in the convenience sample), more in line with the European median. Conclusions: The QS and the subsampling procedures proposed in this study could represent effective tools to improve the quality of prevalence studies, decreasing the biases that can arise due to non-probabilistic sample collection.


“…Compared with conventional logistic regression, ML can effectively handle complex linear and nonlinear relationships among variables in large datasets, resulting in superior predictive performance (14)(15)(16)(17). While several studies have investigated the risk factors and prediction models for IFI (18)(19)(20)(21)(22)(23)(24)(25)(26), existing models often suffer from limitations such as small sample sizes, focusing on a single fungal infection, or utilizing features that are difficult to obtain in clinical practice. In addition, given the subdued incidence rate of IFI, datasets commonly display an imbalance.…”

Section: Introduction (mentioning)

confidence: 99%

Interpretable machine learning for predicting risk of invasive fungal infection in critically ill patients in the intensive care unit: A retrospective cohort study based on MIMIC-IV database

Cao, Li, Wang et al. 2024, Shock


The delayed diagnosis of invasive fungal infection (IFI) is highly correlated with poor prognosis in patients. Early identification of high-risk patients with invasive fungal infections and timely implementation of targeted measures is beneficial for patients. The objective of this study was to develop a machine learning-based predictive model for invasive fungal infection in patients during their Intensive Care Unit (ICU) stay. Retrospective data were extracted from adult patients in the MIMIC-IV database who spent a minimum of 48 hours in the ICU. Feature selection was performed using LASSO regression, and the dataset was balanced using the BL-SMOTE approach. Predictive models were built using six machine learning algorithms. The Shapley additive explanation (SHAP) algorithm was employed to assess the impact of various clinical features in the optimal model, enhancing interpretability. The study included 26,346 ICU patients, of whom 379 (1.44%) were diagnosed with invasive fungal infection. The predictive model was developed using 20 risk factors, and the dataset was balanced using the borderline-SMOTE (BL-SMOTE) algorithm. The BL-SMOTE random forest model demonstrated the highest predictive performance (AUC 0.88, 95% CI: 0.84-0.91). SHAP analysis revealed that the three most influential clinical features in the BL-SMOTE random forest model were dialysis treatment, APSIII scores, and liver disease. The machine learning model provides a reliable tool for predicting the occurrence of IFI in ICU patients. The BL-SMOTE random forest model, based on 20 risk factors, exhibited superior predictive performance and can assist clinicians in early assessment of IFI occurrence in ICU patients. Importance: Invasive fungal infections are characterized by high incidence and high mortality rates. In this study, we developed a clinical prediction model for invasive fungal infections in critically ill patients based on machine learning algorithms. The results show that the machine learning model based on 20 clinical features has good predictive value.
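The abstract above reports a strongly imbalanced outcome (379 of 26,346 patients) and balancing with borderline-SMOTE. The listing does not show how the study implemented this, so the following is only a simplified, pure-Python sketch of the general borderline-SMOTE idea: minority samples whose nearest neighbours are mostly (but not entirely) majority samples are treated as "borderline", and synthetic points are interpolated from them toward nearby minority samples. The function name `borderline_smote`, its parameters, and the toy 2-D data are hypothetical and not taken from the cited study.

```python
import math
import random

# Incidence reported in the abstract: 379 cases among 26,346 ICU patients.
incidence = 379 / 26346
print(f"{incidence:.2%}")  # 1.44%


def borderline_smote(minority, majority, k=3, n_new=5, seed=0):
    """Simplified borderline-SMOTE sketch (illustrative, not the study's code).

    minority, majority: lists of equal-length numeric tuples.
    Returns a list of n_new synthetic minority points, or [] if no
    borderline sample exists or there are fewer than two minority samples.
    """
    rng = random.Random(seed)
    n_min = len(minority)
    if n_min < 2:
        return []
    everything = minority + majority  # indices < n_min are minority

    # Step 1: find "danger" (borderline) minority samples, i.e. those with
    # at least half, but not all, of their k nearest neighbours in the
    # majority class.
    danger = []
    for idx, point in enumerate(minority):
        others = [j for j in range(len(everything)) if j != idx]
        nearest = sorted(others, key=lambda j: math.dist(point, everything[j]))[:k]
        n_major = sum(1 for j in nearest if j >= n_min)
        if k / 2 <= n_major < k:
            danger.append(idx)
    if not danger:
        return []

    # Step 2: interpolate from a borderline sample toward one of its
    # nearest minority neighbours.
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        candidates = [j for j in range(n_min) if j != i]
        nearest_min = sorted(
            candidates, key=lambda j: math.dist(minority[i], minority[j])
        )[:k]
        j = rng.choice(nearest_min)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


# Toy data (hypothetical): two minority points sit next to majority points.
minority = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9)]
majority = [(1.2, 1.1), (1.3, 0.8), (0.9, 1.2), (3.0, 3.0), (3.2, 3.1), (2.9, 3.3)]
synthetic = borderline_smote(minority, majority, k=3, n_new=5, seed=0)
print(len(synthetic))
```

In practice a maintained library implementation would normally be preferred over a hand-written one, and oversampling would be applied only to the training split so that evaluation data remain untouched.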
