A classification for complex imbalanced data in disease screening and early diagnosis

Li, Yiming; Hsu, Wei‐Wen

doi:10.1002/sim.9442

Cited by 7 publications

(6 citation statements)

References 75 publications

“…The model is built using a decision tree generator algorithm [16,21], whose output is easily interpretable and applicable in a multitude of contexts. Data preprocessing: As shown in Table 1, the dataset obtained is characterized by being imbalanced [14]. This fact indicates that the data present an unequal distribution among classes [15], implying that the percentage of observations at each PUs risk level is quite uneven.…”

Section: Methodology For Building the Decision Tree Models (mentioning)

confidence: 99%

“…This, in turn, will facilitate effective interventions, including the selection of support surfaces and other relevant applications. For this purpose, data from 16,215 patients in Granada (Spain) have been collected, focusing mainly on indicators that are quickly identifiable in practice by nursing professionals, such as mobility, activity and skin humidity [4] The data treated present a common problem in nursing classification datasets, such as the presence of class imbalance [14]. In order address this issue, the dataset is preprocessed and various oversampling configurations are studied to determine the best one allowing to improve data quality before modeling [15].…”

Section: Introduction (mentioning)

confidence: 99%

Differentiating Pressure Ulcer Risk Levels Using Decision Trees Based on Interpretable Indicators: A Practical Approach for Nursing Care

Vera-Salmerón, Dominguez-Nogueira, Sáez et al. 2024

Preprint

Pressure ulcers carry a significant risk in clinical practice and require effective preventive measures. This paper proposes a practical and interpretable approach to estimate the risk levels of pressure ulcers using decision tree models. In order to address the common problem of imbalanced learning in nursing classification datasets, various oversampling configurations are analyzed to improve data quality prior to modeling. The decision trees built are based on three easily identifiable and clinically relevant pressure ulcer risk indicators: mobility, activity and skin moisture. Their analysis allows nursing professionals to predict the risk levels of pressure ulcer and make informed decisions about patient care. Additionally, this research introduces a novel tabular visualization method to enhance the usability of the decision trees in clinical practice. The approach proposed aims to support nursing professionals in making timely decisions regarding the appropriate preventive interventions according to the risk levels of pressure ulcers, thus improving patient outcomes and healthcare costs. The usefulness and effectiveness of the models presented make them a valuable resource for nursing care in the prevention of pressure ulcers.

“…Data preprocessing. As shown in Table 1, the dataset obtained is characterized by being imbalanced [26]. This fact indicates that the data present an unequal distribution among classes [27], implying that the percentage of observations at each PU risk level is quite uneven.…”

mentioning

confidence: 94%

“…For this purpose, data from 16,215 patients in Granada (Spain) have been collected, focusing mainly on indicators that are quickly identifiable in practice by nursing professionals, such as mobility, activity, and skin humidity [4]. The data present a common problem in nursing classification datasets, such as the presence of class imbalance [26]. In order to address this issue, the dataset is preprocessed and various oversampling configurations are studied to determine the best one to enable the improvement of the data quality before modeling [27].…”

Section: Introduction (mentioning)

confidence: 99%

Differentiating Pressure Ulcer Risk Levels through Interpretable Classification Models Based on Readily Measurable Indicators

Vera-Salmerón, Domínguez-Nogueira, Sáez et al. 2024

Healthcare

Pressure ulcers carry a significant risk in clinical practice. This paper proposes a practical and interpretable approach to estimate the risk levels of pressure ulcers using decision tree models. In order to address the common problem of imbalanced learning in nursing classification datasets, various oversampling configurations are analyzed to improve the data quality prior to modeling. The decision trees built are based on three easily identifiable and clinically relevant pressure ulcer risk indicators: mobility, activity, and skin moisture. Additionally, this research introduces a novel tabular visualization method to enhance the usability of the decision trees in clinical practice. Thus, the primary aim of this approach is to provide nursing professionals with valuable insights for assessing the potential risk levels of pressure ulcers, which could support their decision-making and allow, for example, the application of suitable preventive measures tailored to each patient’s requirements. The interpretability of the models proposed and their performance, evaluated through stratified cross-validation, make them a helpful tool for nursing care in estimating the pressure ulcer risk level.
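The abstract above mentions stratified cross-validation without describing the procedure. As a rough illustration only, the sketch below shows one common way to build stratified folds, so that each fold keeps roughly the class proportions of the full dataset. The function name `stratified_folds` and the toy labels are hypothetical and are not taken from the cited paper.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds with similar class proportions.

    Hypothetical helper for illustration; not the cited paper's code.
    """
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy imbalanced labels (assumed data): 8 "low" risk, 4 "high" risk.
labels = ["low"] * 8 + ["high"] * 4
folds = stratified_folds(labels, k=4)
for fold in folds:
    print(sorted(labels[i] for i in fold))
```

With imbalanced data, a plain random split could leave a fold with very few minority-class samples; dealing each class separately avoids that in this toy case.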

“…For example, in the medical diagnosis of human cancer disease compared to healthy people belong to the minority, but obviously, people pay more attention to cancer disease. If the patient is diagnosed as healthy, the patient will not receive timely treatment, resulting in irreparable results [7], [8]. Therefore, the analysis of minority data is particularly important.…”

Section: Introduction (mentioning)

confidence: 99%

Imbalanced Data Over-Sampling Method Based on ISODATA Clustering

LV, LIU 2023

IEICE Trans. Inf. & Syst.

Class imbalance is one of the challenges faced in the field of machine learning. It is difficult for traditional classifiers to predict the minority class data. If the imbalanced data is not processed, the effect of the classifier will be greatly reduced. Aiming at the problem that the traditional classifier tends to the majority class data and ignores the minority class data, imbalanced data over-sampling method based on iterative self-organizing data analysis technique algorithm (ISODATA) clustering is proposed. The minority class is divided into different sub-clusters by ISODATA, and each sub-cluster is over-sampled according to the sampling ratio, so that the sampled minority class data also conforms to the imbalance of the original minority class data. The new imbalanced data composed of new minority class data and majority class data is classified by SVM and Random Forest classifier. Experiments on 12 datasets from the KEEL datasets show that the method has better G-means and F-value, improving the classification accuracy.
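The abstract above reports G-means and F-value but does not define them. The sketch below assumes the definitions commonly used in imbalanced learning: G-mean as the geometric mean of sensitivity and specificity, and F-value as the F1 score of the minority (positive) class. The function name and the confusion-matrix counts are illustrative assumptions, not results from the cited paper.

```python
import math


def g_mean_and_f_measure(tp, fn, fp, tn):
    """Return (G-mean, F1) from binary confusion-matrix counts.

    Assumed definitions: G-mean = sqrt(sensitivity * specificity),
    F1 = harmonic mean of precision and recall on the positive class.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    g_mean = math.sqrt(recall * specificity)
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return g_mean, f_measure


# Made-up counts: 10 minority samples, 90 majority samples.
g_mean, f_measure = g_mean_and_f_measure(tp=8, fn=2, fp=10, tn=80)
accuracy = (8 + 80) / 100
print(f"accuracy={accuracy:.2f} g_mean={g_mean:.3f} f_measure={f_measure:.3f}")
```

In this made-up case the accuracy is 0.88 while the F1 score is noticeably lower, which is why minority-focused metrics are often reported alongside accuracy on imbalanced data.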
