2019
DOI: 10.17993/3ctecno.2019.specialissue2.206-221

Data Preprocessing: A preliminary step for web data mining

Abstract: In recent years, immense growth of data, i.e. big data, has been observed, resulting in a brighter and more optimized future. Big data demands large computational infrastructure with high-performance processing capabilities. Preparing big data for mining and analysis is a challenging task and requires the data to be preprocessed to improve the quality of the raw data. The data instance representation and quality are foremost. Data preprocessing is a preliminary data mining practice in which raw data is transformed into a format s…

citations
Cited by 19 publications
(13 citation statements)
references
References 6 publications
(6 reference statements)
“…Data preparation is one of the most critical phases in any methodological approach based on data mining. During this phase, the quality of the data is improved by cleaning, integrating, scaling, reducing dimensionality, transforming, and selecting relevant features from raw data, enhancing machine learning algorithms’ performance [25, 26]. Data preparation can be seen as a phase that transforms real-world raw inconsistent and incomplete data into an understandable format.…”
Section: Materials and Methods; citation type: mentioning
confidence: 99%
“…The data preprocessing is the most critical step in analytics as almost 70% of the analysis time is consumed in preparation and preprocessing of the data [56,57]. This stage has been further divided into the following data preprocessing steps.…”
Section: Data Quality Evaluation; citation type: mentioning
confidence: 99%
“…Once transformed, data is key to more agile and assertive decisions, a remarkable feature that denotes the ability to analyse and implement real-time changes required by the Fourth Industrial Revolution [8] [12]. In this context, this new scenario requires higher quality and reliability of the information, where data is evaluated according to integrity, consistency, credibility, accuracy and clarity in its parameters, so that any analysis can discover models and patterns, relevant characteristics or trends in organisational historical records, the "Log of events" [13].…”
Section: Industry 4.0 and Data Gathering and Analysis; citation type: mentioning
confidence: 99%