2017
DOI: 10.1109/jbhi.2016.2525731

Can Cluster-Boosted Regression Improve Prediction of Death and Length of Stay in the ICU?

Abstract: Sharing of personal health information is subject to multiple constraints, which may dissuade some organizations from sharing their data. Summarized deidentified data, such as that derived from k-means cluster analysis, is subject to far fewer privacy-related constraints. In this paper, we examine the extent to which analysis of clustered patient types can match predictions made by analyzing the entire dataset at once. After reviewing relevant literature, and explaining how data are summarized in each cluster …

Cited by 29 publications (18 citation statements)
References 17 publications
“…A list of potential predictors, and how the predictors were coded can be found in Appendix 1. For continuous variables (WBC count, hemoglobin, and age), we used k means clustering to choose cut-points (see Appendix 1) in order to improve accuracy and decrease over-fitting [24,25]. The outcomes were coded as binary outcomes.…”
Section: Methods (mentioning)
confidence: 99%
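The statement above says k-means clustering was used to choose cut-points for continuous predictors but does not say how the cut-points were derived from the clusters. A minimal sketch, assuming the cut-points are the midpoints between adjacent cluster centers (which is where one-dimensional k-means assignments change): the function names `kmeans_1d`, `cut_points`, and `to_bin`, the initialization scheme, and the age values are all hypothetical and are not taken from the cited work.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on a list of numbers; returns the sorted cluster centers."""
    xs = sorted(values)
    # Assumption: deterministic start, with centers spread evenly over the sorted data.
    centers = [xs[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def cut_points(centers):
    """Midpoints between adjacent centers (assumed definition of a cut-point)."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def to_bin(x, cuts):
    """Ordinal category of x: the number of cut-points lying below it."""
    return sum(x > c for c in cuts)


# Made-up ages, for illustration only.
ages = [22, 25, 27, 48, 50, 52, 78, 80, 83]
cuts = cut_points(kmeans_1d(ages, k=3))
coded = [to_bin(a, cuts) for a in ages]
```

With three clusters this yields two cut-points, so each age is coded as 0, 1, or 2. The number of clusters is a modelling choice that the quoted statement does not specify.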
“…After each patient in the test set was assigned to a cluster, the associated regression functions obtained from the train set were used to predict the death status for a particular patient. We conducted the experiments with four different datasets and compared our results with the baseline results from previous work [4], which only considered numerical features. The first configuration was the existing data combined with the selected data of DESCRIPTION.…”
Section: F Evaluation Setting and Statistical Analysis (mentioning)
confidence: 99%
“…Therefore, it is feasible to share summarized personal health information across different health data repositories of many organizations without privacy violations. Rouzbahman et al [4] performed cluster analysis to partition patients into similar groups. They then predicted patient mortality and length of stay in the ICU using regression prediction for each cluster.…”
Section: Introduction (mentioning)
confidence: 99%
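The citation statements above describe a cluster-then-regress procedure: partition the training patients with k-means, fit one regression per cluster, then assign each test patient to the nearest cluster and apply that cluster's regression. A minimal sketch of that idea follows, under loud assumptions: the two features (age, severity score), the length-of-stay values, the initial centers, and all function names are invented for illustration, and a single-predictor least-squares line stands in for whatever regression the papers actually used.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])))


def kmeans(points, centers, iters=50):
    """Lloyd's algorithm starting from the given initial centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def fit_cluster_models(train_x, train_y, init_centers):
    """Cluster the training set, then fit one line per cluster (severity -> outcome)."""
    centers = kmeans(train_x, init_centers)
    models = []
    for j in range(len(centers)):
        # Assumes every cluster ends up with at least one training patient.
        rows = [(x, y) for x, y in zip(train_x, train_y) if nearest(x, centers) == j]
        models.append(fit_line([x[1] for x, _ in rows], [y for _, y in rows]))
    return centers, models


def predict(x, centers, models):
    """Assign a new patient to the nearest cluster and use that cluster's line."""
    a, b = models[nearest(x, centers)]
    return a + b * x[1]


# Made-up training data: (age, severity score) -> length of stay in days.
train_x = [(30, 1), (32, 2), (34, 3), (70, 1), (72, 2), (74, 3)]
train_y = [2, 3, 4, 6, 8, 10]
centers, models = fit_cluster_models(train_x, train_y, [(30, 1), (70, 1)])
younger = predict((33, 2), centers, models)
older = predict((71, 3), centers, models)
```

In this sketch only `centers` and the per-cluster coefficients in `models` are needed at prediction time, which is in the spirit of the summarized, deidentified sharing the abstract describes. For the binary death outcome, a logistic regression per cluster would be the natural substitute for `fit_line`.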