2020
DOI: 10.1145/3360904

Computing Optimal Repairs for Functional Dependencies

Abstract: We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair) that is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair) that is obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets o…
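To make the notion of an optimal S-repair concrete, the following is a minimal Python sketch. It is not the paper's polynomial-time algorithm: it is an exhaustive search that tries the largest subsets of tuples first and returns the first one that satisfies every FD, so it runs in exponential time and is only suitable for tiny relations. The function names `violates` and `optimal_s_repair`, the encoding of an FD as a pair of attribute tuples, and the sample relation are all illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def violates(rows, fds):
    """Return True if some FD (lhs -> rhs) is violated by the given rows.

    Each FD is a pair (lhs, rhs) of attribute-name tuples (assumed encoding).
    """
    for lhs, rhs in fds:
        seen = {}  # maps each lhs value to the first rhs value observed with it
        for row in rows:
            key = tuple(row[a] for a in lhs)
            val = tuple(row[a] for a in rhs)
            if seen.setdefault(key, val) != val:
                return True  # same lhs value, different rhs value
    return False


def optimal_s_repair(rows, fds):
    """Brute force: keep as many tuples as possible (delete as few as possible)."""
    n = len(rows)
    for size in range(n, -1, -1):  # largest subsets first
        for keep in combinations(range(n), size):
            subset = [rows[i] for i in keep]
            if not violates(subset, fds):
                return subset
    return []  # unreachable in practice: the empty subset is always consistent


# Hypothetical relation with the single FD dept -> city.
rows = [
    {"emp": "ann", "dept": "db", "city": "paris"},
    {"emp": "bob", "dept": "db", "city": "rome"},
    {"emp": "cat", "dept": "db", "city": "paris"},
    {"emp": "dan", "dept": "ml", "city": "oslo"},
]
fds = [(("dept",), ("city",))]

repair = optimal_s_repair(rows, fds)
print(len(rows) - len(repair), "tuple(s) deleted")
```

In this sample, the three `db` tuples disagree on `city`, and deleting the single `rome` tuple restores consistency, so the smallest number of deletions is one. An optimal U-repair would instead count changed cells, which this sketch does not attempt.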

Cited by 36 publications (49 citation statements)
References 37 publications (66 reference statements)
“…The problem has been studied extensively in database theory for various classes of constraints Γ. It is NP-hard even when D consists of a single relation (as it does in our paper) and Γ consists of functional dependencies [27]. In our setting, Γ consists of conditional independence statements, and it remains NP-hard, as we show in Sec.…”
Section: Preliminaries (mentioning)
confidence: 68%
“…Regarding methods of data repair, previous works have considered two main approaches: (1) repairing attribute values in cells [6,11,29,33,44] and (2) tuple deletion [10,33,34]; our work focuses on the latter. A major advantage of our approach is the ability to perform cascade deletions over multiple relations in the database while following different well-defined semantics (and the admin may choose which one to follow based on the application scenario).…”
Section: Related Work (mentioning)
confidence: 99%
“…A major advantage of our approach is the ability to perform cascade deletions over multiple relations in the database while following different well-defined semantics (and the admin may choose which one to follow based on the application scenario). Similar to our independent semantics, a common objective for data repairs is to change the database in the minimal way that will make it consistent with the constraints [5,19,33]. In some scenarios a good repair can be obtained by changing values in the database and the metric of minimal changes may not work well [44].…”
Section: Related Work (mentioning)
confidence: 99%
“…To capture partial knowledge of the rules and the data, we clean data by providing probabilistic fixes. Then, using our solution once all rules are known and given the probabilistic suggestions, we can either use inference [23,29,36] when master data exist, or have humans fix the errors in the query results. Inference approaches over the probabilistic data are complementary and out of the scope of this work.…”
Section: From Offline To Online Data Cleaning (mentioning)
confidence: 99%