Jaffer Gardezi scite author profile

Matching dependencies (MDs) are used to declaratively specify the identification (or matching) of certain attribute values in pairs of database tuples when some similarity conditions on other values are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the pure case of MD enforcement, an arbitrary value from the underlying data domain can be used for the value in common that is used for a matching. However, the overall number of changes of attribute values is expected to be kept to a minimum. We investigate this case in terms of semantics and the properties of data cleaning through the enforcement of MDs. We characterize the intended clean instances, and also the clean answers to queries, as those that are invariant under the cleaning process. The complexity of computing clean instances and clean query answering is investigated. Tractable and intractable cases depending on the MDs are identified and characterized.

Tractable Cases of Clean Query Answering under Entity Resolution via Matching Dependencies

Gardezi

Bertossi

2012

Matching Dependencies (MDs) are a recent proposal for declarative entity resolution. They are rules that specify, given the similarities satisfied by values in a database, what values should be considered duplicates, and have to be matched. On the basis of a chase-like procedure for MD enforcement, we can obtain clean (duplicate-free) instances; possibly several of them. The clean answers to queries (which we call the resolved answers) are invariant under the resulting class of instances. Identifying the clean versions of a given instance is generally an intractable problem. In this paper, we show that for a certain class of MDs, the characterization of the clean instances is straightforward. This is an important result, because it leads to tractable cases of resolved query answering. Further tractable cases are derived by making connections with tractable cases of CQA.
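As a rough illustration of the ideas in this abstract, the sketch below enforces one hypothetical MD of the form "if two tuples have similar `name` values, their `phone` values must be matched". It is not the authors' algorithm: the similarity test, the attribute names, and the rule of keeping a most frequent value in each group (so that as few values as possible change) are all assumptions made for the example, and the sketch only covers a single MD whose left-hand and right-hand attributes do not interact. Ties between equally frequent values are what produce several clean instances, and the resolved answers are the ones that hold in all of them.

```python
from collections import Counter
from itertools import product

def similar(a, b):
    # Hypothetical similarity relation: same length, at most one differing character.
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1

def groups(rows, key):
    # Union-find over tuple indexes: the transitive closure of the similarity on `key`.
    parent = list(range(len(rows)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if similar(rows[i][key], rows[j][key]):
                parent[find(i)] = find(j)
    out = {}
    for i in range(len(rows)):
        out.setdefault(find(i), []).append(i)
    return list(out.values())

def resolved_instances(rows, lhs, rhs):
    # For each group, any most frequent `rhs` value changes the fewest tuples (assumption).
    choices = []
    for g in groups(rows, lhs):
        counts = Counter(rows[i][rhs] for i in g)
        best = max(counts.values())
        choices.append((g, sorted(v for v, c in counts.items() if c == best)))
    instances = []
    for pick in product(*(cands for _, cands in choices)):
        inst = [dict(r) for r in rows]  # copy so the input instance is untouched
        for (g, _), value in zip(choices, pick):
            for i in g:
                inst[i][rhs] = value
        instances.append(inst)
    return instances

def resolved_answers(rows, lhs, rhs, attr):
    # Answers to the projection on `attr` that hold in every resolved instance.
    insts = resolved_instances(rows, lhs, rhs)
    return set.intersection(*({r[attr] for r in inst} for inst in insts))

rows = [
    {"name": "smith", "phone": "111"},
    {"name": "smyth", "phone": "222"},
    {"name": "jones", "phone": "333"},
]
print(len(resolved_instances(rows, "name", "phone")))  # 2
print(resolved_answers(rows, "name", "phone", "phone"))  # {'333'}
```

In this toy instance the two similar names tie between "111" and "222", so there are two clean instances and only "333" survives as a resolved answer. Enumerating all instances is exponential in the number of tied groups, which is consistent with the abstract's interest in cases where the clean instances can be characterized directly.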

Query Rewriting Using Datalog for Duplicate Resolution

Gardezi

Bertossi

2012

Matching Dependencies (MDs) are a recent proposal for declarative entity resolution. They are rules that specify, given the similarities satisfied by values in a database, what values should be considered duplicates, and have to be matched. On the basis of a chase-like procedure for MD enforcement, we can obtain clean (duplicate-free) instances; possibly several of them. The clean answers to queries (which we call the resolved answers) are invariant under the resulting class of instances. In this paper, we investigate a query rewriting approach to obtaining the resolved answers (for certain classes of queries and MDs). The rewritten queries are specified in stratified Datalog^{not,s} with aggregation. In addition to the rewriting algorithm, we discuss the semantics of the rewritten queries, and how they could be implemented by means of a DBMS.

Tractable vs. Intractable Cases of Query Answering under Matching Dependencies

Bertossi

Gardezi

2014

Matching Dependencies (MDs) are a recent proposal for declarative entity resolution. They are rules that specify, on the basis of similarities satisfied by values in a database, what values should be considered duplicates, and have to be matched. On the basis of a chase-like procedure for MD enforcement, we can obtain clean (duplicate-free) resolved instances, possibly several of them. The resolved answers to a query are invariant under the class of resolved instances. Previous work identified classes of queries and sets of MDs for which resolved query answering is tractable, with special emphasis on cyclic sets of MDs. In this work we further investigate the complexity of this problem, identifying intractable cases, and exploring the frontier between tractability and intractability. We concentrate mostly on acyclic sets of MDs. For a special case we obtain a dichotomy result relative to NP-hardness.

Matching Dependencies with Arbitrary Attribute Values: Semantics, Query Answering and Integrity Constraints

Gardezi

Bertossi

Kiringa

2010

Preprint

Matching dependencies (MDs) were introduced to specify the identification or matching of certain attribute values in pairs of database tuples when some similarity conditions are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the pure case of MDs, any value from the underlying data domain can be used for the value in common that does the matching. We investigate the semantics and properties of data cleaning through the enforcement of matching dependencies for the pure case. We characterize the intended clean instances and also the clean answers to queries as those that are invariant under the cleaning process. The complexity of computing clean instances and clean answers to queries is investigated. Tractable and intractable cases depending on the MDs and queries are identified. Finally, we establish connections with database repairs under integrity constraints.

Matching dependencies: semantics and query answering