2019
DOI: 10.1007/s00778-019-00586-5

Cleaning data with Llunatic

Abstract: Data cleaning (or data repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a given set of constraints. In recent years, repairing methods have been proposed for several classes of constraints. These methods, however, tend to hard-code the strategy to repair conflicting values and are specialized toward specific classes of constraints. In this paper, we develop a general chase-based repairing framework, referred to as Llunatic, i…

Cited by 28 publications (24 citation statements)
References 48 publications (106 reference statements)
“…This value is below the query-response user-tolerance time threshold for interactive systems [23], which means that it is considered by experts to be acceptable for users querying the systems. Moreover, in additional experiments, in which we used hardcoded rules, rather than the rule interpreter [15], the values of the Algorithm 2 reformulation overhead dropped to tens of ms vs. the time-difference values shown in Fig. 6.…”
Section: Summary Of Experimental Results
confidence: 98%
“…We have implemented Algorithms 1-2 on top of Java 1.8, using the Llunatic [15] rule interpreter for rewriting and expanding temporally annotated queries into standard executable SPARQL queries. The relational source data were stored using PostgreSQL 11, and storing and manipulating target RDF triples was done with RDF4J 3.0.1.…”
Section: Implementation and Experimental Setup
confidence: 99%
“…To that extent, some approaches from the field of constraint-based consistency verification follow the same strategy. For example, Llunatic allows to model extended equality-generating dependencies (EGDs) and uses a generalization of the Chase algorithm to construct a Chase tree that can be used to search repairs of dirty data [4,5]. In practical applications, the branching factor of a Chase tree is so high, that pruning strategies are required to keep the computational effort feasible.…”
Section: Related Work
confidence: 99%
“…All constraints in Figure 1 can be represented in the framework of conditional functional dependencies (CFDs) [1,2] or denial constraints (DCs) [3], so why not use them? The main problem with these formalisms, is that finding (minimal) repairs is a computationally intensive task [4,5,6,7]. When using edit rules, however, the problem of finding (minimal) repairs boils down to finding (minimal) set covers of failing rules if the given rules satisfy a closure property [8,9,10].…”
Section: Introduction
confidence: 99%
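The set-cover view mentioned in the statement above can be illustrated with a small sketch. It assumes one common reading of the reduction: each failing rule is associated with the set of attributes it involves, and a cover is a set of attributes that touches every failing rule. The function name `minimum_covers`, the rule names, and the attribute names are hypothetical, and the brute-force search returns smallest-cardinality covers (which are also minimal under set inclusion); it is not the algorithm of the cited work.

```python
from itertools import combinations


def minimum_covers(failing_rules):
    """Return all smallest attribute sets that touch every failing rule.

    failing_rules maps a (hypothetical) rule name to the set of attributes
    that rule involves. Brute force: try covers of increasing size.
    """
    if not failing_rules:
        # Nothing fails, so the empty set of attributes is the only cover needed.
        return [set()]
    attributes = sorted(set().union(*failing_rules.values()))
    for size in range(1, len(attributes) + 1):
        covers = [
            set(candidate)
            for candidate in combinations(attributes, size)
            # A candidate is a cover if it shares an attribute with every rule.
            if all(set(candidate) & attrs for attrs in failing_rules.values())
        ]
        if covers:
            return covers
    return []


# Hypothetical record that violates two rules; both rules involve "age".
failing = {
    "rule_1": {"age", "marital_status"},
    "rule_2": {"age", "income"},
}
print(minimum_covers(failing))
```

In this toy case, changing the single attribute shared by both failing rules is enough, which is why the search stops at covers of size one.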
“…In this context, input information of a specific type corresponds to a signal. In the literature, existing methods usually rely on one of several signals: (i) ICs (BESKALES et al, 2013; GEERTS et al, 2019); (ii) external information (FAN et al, 2009; CHU et al, 2015; HEIDARI et al, 2020) Methods that rely on ICs assume the majority of input data to be clean and perform repairs Definition. Given an inconsistent database instance r and a consistent database instance r′, the principle of minimality states that their symmetric difference r ⊕ r′ is minimal with respect to set inclusion.…”
Section: Holistic Data Cleaning
confidence: 99%
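The principle of minimality quoted above can be made concrete with a short sketch. It assumes database instances are modelled as Python sets of tuples and that the consistent candidate instances are already given; the function name `is_minimal_repair` and the toy data (a functional dependency from name to city) are illustrative assumptions, not taken from the cited work.

```python
def is_minimal_repair(r, candidate, consistent_instances):
    """Check the minimality principle for one candidate repair.

    The candidate is minimal if no other consistent instance has a symmetric
    difference with r that is a proper subset of the candidate's.
    """
    delta = r ^ candidate  # symmetric difference r (+) r'
    return not any((r ^ other) < delta for other in consistent_instances)


# Toy instance violating the (assumed) dependency name -> city.
r = {("ann", "rome"), ("ann", "paris"), ("bob", "oslo")}

# Three consistent candidates: drop one conflicting tuple, or drop both.
keep_rome = {("ann", "rome"), ("bob", "oslo")}
keep_paris = {("ann", "paris"), ("bob", "oslo")}
drop_both = {("bob", "oslo")}
candidates = [keep_rome, keep_paris, drop_both]

for candidate in candidates:
    print(sorted(candidate), is_minimal_repair(r, candidate, candidates))
```

Dropping both conflicting tuples is consistent but not minimal, because its difference from `r` strictly contains the difference of either single-deletion candidate.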