2023
DOI: 10.3390/technologies11040101

Cleaning Big Data Streams: A Systematic Literature Review

Abstract: In today’s big data era, cleaning big data streams has become a challenging task because of the different formats of big data and the massive amount of big data which is being generated. Many studies have proposed different techniques to overcome these challenges, such as cleaning big data in real time. This systematic literature review presents recently developed techniques that have been used for the cleaning process and for each data cleaning issue. Following the PRISMA framework, four databases are searche…

Cited by 5 publications (3 citation statements)
References 86 publications (104 reference statements)
“…Ref. [7] explores the broader context of cleaning big data streams, emphasizing the challenges posed by continuous data generation and the limitations of traditional data cleaning methods. They highlight the importance of addressing common issues in data cleaning, including missing values, duplicated data, outliers, and irrelevant data, within the context of big data streams.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Ref. [7] emphasizes the importance of real-time data cleaning in managing large volumes of data, particularly in high-mix, low-volume production environments. Their insights address challenges in data analysis, aligning with the precision needed for production planning amidst structural changes and data quality issues.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Methodologies utilized to assess the efficacy of these treatments were also discovered [49]. Missing values, duplicated data, outliers, and irrelevant data were listed as cleaning concerns that may arise throughout the cleaning process.…”
Section: Cleaning Data (mentioning)
confidence: 99%