2021
DOI: 10.1007/s12652-021-03590-2

Data cleansing mechanisms and approaches for big data analytics: a systematic study

Abstract: With the evolution of new technologies, the production of digital data is constantly growing. It is thus necessary to develop data management strategies in order to handle the large-scale datasets. The data gathered through different sources, such as sensor networks, social media, business transactions, etc. is inherently uncertain due to noise, missing values, inconsistencies and other problems that impact the quality of big data analytics. One of the key challenges in this context is to detect and repair dir…

Cited by 8 publications (6 citation statements)
References 38 publications (30 reference statements)
“…Our approach therefore combines a sample-based, an expert-based, and a rule-based approach. This way, the best possible data preparation quality can be achieved [79].…”
Section: Data Preparation Phase | Citation type: mentioning
confidence: 99%
“…The demand for extensive data, often referred to as “data hungriness” (62), poses medico-legal challenges, as single institutions may lack sufficient data for reliable predictions (16). Finally, data cleansing can enhance data usability in the context of intelligence, but it must be implemented carefully to avoid introducing another source of errors (59, 63). This is the hypothetical responsibility of trainers and programmers, with potential shared responsibility on the part of the purchasing company, considering the limitations in the assessment related to the concepts of bias and error.…”
Section: Medical Malpractice Liability Assessment | Citation type: mentioning
confidence: 99%
“…In this dataset we cleanse IP addresses that do not give a ping response (Hosseinzadeh et al., 2021).…”
Section: Cleansing | Citation type: mentioning
confidence: 99%