2022
DOI: 10.1002/sim.9442

A classification for complex imbalanced data in disease screening and early diagnosis

Abstract: Imbalanced classification has drawn considerable attention in the statistics and machine learning literature. Traditional classification methods often perform poorly when the class distribution is severely skewed, not to mention under a high-dimensional longitudinal data structure. Given the ubiquity of big data in modern health research, it is expected that imbalanced classification in disease diagnosis may encounter an additional level of difficulty that is imposed by such a complex data st…

Cited by 7 publications (6 citation statements)
References 75 publications
“…The model is built using a decision tree generator algorithm [16,21], whose output is easily interpretable and applicable in a multitude of contexts. Data preprocessing: As shown in Table 1, the dataset obtained is characterized by being imbalanced [14]. This fact indicates that the data present an unequal distribution among classes [15], implying that the percentage of observations at each PUs risk level is quite uneven.…”
Section: Methodology For Building the Decision Tree Models
mentioning
confidence: 99%
“…This, in turn, will facilitate effective interventions, including the selection of support surfaces and other relevant applications. For this purpose, data from 16,215 patients in Granada (Spain) have been collected, focusing mainly on indicators that are quickly identifiable in practice by nursing professionals, such as mobility, activity and skin humidity [4]. The data treated present a common problem in nursing classification datasets, such as the presence of class imbalance [14]. In order to address this issue, the dataset is preprocessed and various oversampling configurations are studied to determine the best one for improving data quality before modeling [15].…”
Section: Introduction
mentioning
confidence: 99%
“…Data preprocessing. As shown in Table 1, the dataset obtained is characterized by being imbalanced [26]. This fact indicates that the data present an unequal distribution among classes [27], implying that the percentage of observations at each PU risk level is quite uneven.…”
mentioning
confidence: 94%
“…For this purpose, data from 16,215 patients in Granada (Spain) have been collected, focusing mainly on indicators that are quickly identifiable in practice by nursing professionals, such as mobility, activity, and skin humidity [4]. The data present a common problem in nursing classification datasets, such as the presence of class imbalance [26]. In order to address this issue, the dataset is preprocessed and various oversampling configurations are studied to determine the best one to enable the improvement of the data quality before modeling [27].…”
Section: Introduction
mentioning
confidence: 99%
“…For example, in medical diagnosis, people with cancer are a minority compared with healthy people, but obviously more attention is paid to the cancer cases. If a patient with cancer is diagnosed as healthy, the patient will not receive timely treatment, with irreparable results [7], [8]. Therefore, the analysis of minority data is particularly important.…”
Section: Introduction
mentioning
confidence: 99%