Maintaining strictness in dimensions is important when integrating data warehouses. A dimension that satisfies all of its roll-up constraints is said to be strict, a property required for correct aggregation. Existing work on instance matching does not address the problem of enforcing the strictness of roll-up constraints. In this paper, we take a graph matching-based approach to dimension instance matching and propose an algorithm that enforces strictness and reduces false positives. Because the graph matching algorithm, which builds on similarity flooding, can be greedy in identifying matching members, we propose heuristics to further reduce false positive matches and false strictness. Experiments on real-world data demonstrate the effectiveness of the proposed approach.
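To make the strictness property concrete, the following minimal sketch checks one roll-up relation between two adjacent levels of a dimension. It assumes the usual reading of strictness, namely that each member rolls up to at most one parent in the level above. It illustrates the property only and is not the matching algorithm proposed in the paper; the function names and the city/province data are hypothetical.

```python
from collections import defaultdict


def strictness_violations(rollup_edges):
    """Return the child members that roll up to more than one parent.

    rollup_edges: iterable of (child, parent) pairs between two adjacent
    levels of a dimension (hypothetical input format for this sketch).
    """
    parents = defaultdict(set)
    for child, parent in rollup_edges:
        parents[child].add(parent)
    # A child with two or more distinct parents violates strictness.
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


def is_strict(rollup_edges):
    """A roll-up relation is strict when no child has multiple parents."""
    return not strictness_violations(rollup_edges)


# Made-up example: a false positive match has merged two distinct "Ottawa"
# members, so the merged member now rolls up to two provinces.
edges = [("Toronto", "Ontario"), ("Ottawa", "Ontario"), ("Ottawa", "Quebec")]
violations = strictness_violations(edges)
```

In this toy data the merged member is the only one reported, which shows how a false positive match can break a roll-up constraint and therefore double-count that member during aggregation.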