Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data 2014
DOI: 10.1145/2588555.2610494

Towards dependable data repairing with fixing rules

Abstract: One of the main challenges that data cleaning systems face is to automatically identify and repair data errors in a dependable manner. Though data dependencies (a.k.a. integrity constraints) have been widely studied to capture errors in data, automated and dependable data repairing on these errors has remained a notoriously hard problem. In this work, we introduce an automated approach for dependably repairing data errors, based on a novel class of fixing rules. A fixing rule contains an evidence pattern, a se…

citations
Cited by 92 publications
(29 citation statements)
references
References 23 publications
(48 reference statements)
“…• Rule-based detection algorithms [1,6,15,16,25,33] that can be embedded into frameworks, such as Nadeef [8,23], where a rule can vary from a simple "not null" constraint to multi-attribute functional dependencies (FDs) to user-defined functions. Using this class of tools, a user can specify a collection of rules that clean data will obey, and the tool will find any violations.…”
Section: The Current State
Citation type: mentioning
confidence: 99%
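
The two simplest rule kinds named in the statement above, a "not null" constraint and a functional dependency, can be sketched as plain violation checks. This is a minimal illustration only: the function names, the dictionary-per-row layout, and the sample data are assumptions made here, not the API of Nadeef or of any cited tool.

```python
from collections import defaultdict


def not_null_violations(rows, attr):
    """Return indices of rows whose `attr` is missing or empty."""
    return [i for i, row in enumerate(rows) if row.get(attr) in (None, "")]


def fd_violations(rows, lhs, rhs):
    """Return groups of row indices that agree on `lhs` but disagree on `rhs`.

    Models a functional dependency lhs -> rhs: rows with equal values on
    every attribute in `lhs` must also have equal values on `rhs`.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row.get(a) for a in lhs)].append(i)

    violations = []
    for idxs in groups.values():
        # More than one distinct rhs value within a group breaks the FD.
        if len({rows[i].get(rhs) for i in idxs}) > 1:
            violations.append(idxs)
    return violations


# Hypothetical sample data for illustration.
rows = [
    {"country": "China", "capital": "Beijing"},
    {"country": "China", "capital": "Shanghai"},
    {"country": "Canada", "capital": None},
]

null_hits = not_null_violations(rows, "capital")
fd_hits = fd_violations(rows, ["country"], "capital")
```

Note that both checks only report violations. The functional dependency check flags rows 0 and 1 as conflicting but cannot say which of the two values is wrong, which is the gap that repair-oriented rules aim to close.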
“…Master data management systems are being developed by IBM, SAP, Microsoft and Oracle. In fact, master data has been used to clean relational data using editing rules [9], fixing rules [21] or Sherlock rules [14]. New matching rules across RDF data and relational master data (or trusted knowledge bases) need to be defined, such that similar techniques can be applied to clean RDF data.…”
Section: Some Possible Directions
Citation type: mentioning
confidence: 99%
“…The NADEEF [15] system enables the users to declare a set of such rules via an API and uses advanced algorithms to more efficiently process those rules. The Fixing Rules proposed in [16] allow the users to provide evidence patterns, instead of deterministic rules, to avoid erroneous fixes. Here, an erroneous fix is one that "corrects" values in the data to satisfy the rules, but ends up with the whole tuple being semantically incorrect.…”
Section: Data Cleaning Based On Data Integrity
Citation type: mentioning
confidence: 99%
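
The idea of repairing only when there is enough evidence, as described in the statement above, can be sketched as follows. The rule shape used here (an evidence pattern, a set of values treated as known-wrong, and a replacement value) is an assumption made for illustration, since the abstract on this page is truncated. The class name, field names, and sample values are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FixingRule:
    evidence: dict       # attribute -> value the row must match exactly
    target: str          # attribute the rule may repair
    negative: frozenset  # target values assumed wrong when the evidence holds
    fact: str            # value written when the rule fires


def apply_rule(row, rule):
    """Return a repaired copy of `row` if the rule fires, else `row` unchanged.

    The rule fires only when the evidence matches AND the current target
    value is one of the listed wrong values. Any other value is left alone,
    which keeps the repair conservative.
    """
    evidence_ok = all(row.get(a) == v for a, v in rule.evidence.items())
    if evidence_ok and row.get(rule.target) in rule.negative:
        return {**row, rule.target: rule.fact}
    return row


# Hypothetical rule: for country "China", a capital of "Shanghai" or
# "Hongkong" is treated as wrong and replaced by "Beijing".
rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    negative=frozenset({"Shanghai", "Hongkong"}),
    fact="Beijing",
)

fixed = apply_rule({"country": "China", "capital": "Shanghai"}, rule)
untouched = apply_rule({"country": "China", "capital": "Tokyo"}, rule)
```

In this sketch a row with an unlisted value such as "Tokyo" is not changed, even though it disagrees with the replacement value. Declining to act without matching evidence is what avoids the "erroneous fix" the statement describes.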