2015 31st IEEE International Conference on Data Engineering Workshops 2015
DOI: 10.1109/icdew.2015.7129549

Big RDF data cleaning

Abstract: Data cleaning has played an important part in the history of data management and data analytics. High-quality data has proven crucial for data-driven business decision making, especially in the era of big data. Resource Description Framework (RDF) is a standard model for data interchange on the Semantic Web. However, RDF data is known to be dirty, since much of it is automatically extracted from the web. In this pape…

Cited by 6 publications (2 citation statements)
References 20 publications (15 reference statements)
“…The result of data cleaning is to process various dirty data in a corresponding manner, and obtain standard, clean, continuous data for using in data analysis, such as data statistics, data mining, and so on. Data cleaning is divided into supervised cleaning and unsupervised cleaning [32][33][34]. Supervised cleaning process refers to collecting analytical data under the guidance of domain experts, manually removing obvious noise data and repeating records, filling in missing values and other cleaning actions.…”
Section: Data Cleaning Module Design (mentioning)
confidence: 99%
“…This is primarily as a result of automatic extraction and conversion of data into RDF format; it is also a function of the fact that, in the semantic web context, the same data are contributed by different sources with different understandings. A line of research focuses on data quality and cleaning of LOD data and RDF data in general [5,6].…”
Section: Introduction (mentioning)
confidence: 99%