2015
DOI: 10.3233/thc-140887

An efficient data preprocessing approach for large scale medical data mining

Abstract: BACKGROUND: The size of medical datasets is usually very large, which directly affects the computational cost of the data mining process. Instance selection is a data preprocessing step in the knowledge discovery process, which can be employed to reduce storage requirements while also maintaining the mining quality. This process aims to filter out outliers (or noisy data) from a given (training) dataset. However, when the dataset is very large in size, more time is required to accomplish the instance selection…
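
To make the idea of instance selection concrete, here is a minimal sketch in Python. It uses Wilson's Edited Nearest Neighbour rule as a stand-in: an instance is dropped when its label disagrees with the majority label of its k nearest neighbours. This is an assumption chosen for illustration, not the approach proposed in the paper, and the function name `edited_nearest_neighbour` and the toy data are hypothetical.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def edited_nearest_neighbour(points, labels, k=3):
    """Return indices of instances whose label matches the majority
    label among their k nearest neighbours (illustrative helper)."""
    kept = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Rank every other instance by distance to p and take the k closest.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        # Majority vote among those neighbours.
        majority, _ = Counter(labels[j] for j in neighbours).most_common(1)[0]
        if majority == label:
            kept.append(i)
    return kept


# Toy data (made up): two clusters, plus one point labelled "b"
# that sits in the middle of the "a" cluster and acts as noise.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = edited_nearest_neighbour(points, labels, k=3)
print(kept)  # [0, 1, 2, 3, 5, 6, 7, 8]
```

The noisy instance at index 4 is filtered out while both clusters survive. Note that this naive version compares every instance with every other one, so its cost grows quadratically with dataset size, which illustrates the time concern the abstract raises for very large datasets.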

Cited by 13 publications (10 citation statements)
References 14 publications
“…Data preprocessing can also be employed for the purpose of reducing storage requirements and maintaining mining quality [97]. The authors [97] applied 3 instance selection algorithms for data preprocessing including genetic algorithms and evaluated using ML models: CART decision trees, K-NN, and SVM. Irrelevant or redundant features in health care data seriously affect subsequent model training and classification accuracy.…”
Section: Implementation Challenges In Smart Healthcare Monitoring Fra... (mentioning)
confidence: 99%
“…It is also an inevitable demand for the global medical and health industry information application in clinics. 47…”
Section: Research Background (mentioning)
confidence: 99%
“…14 In addition, the information inputs of EMR recordings can also take large time consumption, which affects the communications between doctors and patients for the detail information of patients and slows down the treatment process to some extent. 4–6…”
Section: Introduction (mentioning)
confidence: 99%
“…Successful applications in drug design, discovery and development can be achieved only when effective computational methods and tools are provided with accurate and reliable pre-processed data [8, 9]. Hereafter, big data and artificial intelligence (AI) approaches to data pre-processing [10], modeling [11, 12] and representative applications in drug design and discovery will be introduced.…”
Section: Introduction (mentioning)
confidence: 99%