Ensuring data quality is a growing challenge, particularly when emerging big data applications. This chapter highlights data quality concepts, terminologies, techniques, as well as research issues. Recent studies have shown that databases are often suffered from inconsistent data, which ought to be resolved in the cleaning process. Data mining techniques can play key role for ensuring data quality, which can be reutilized efficiently in data cleaning process. In this chapter, we introduce an approach for dependably generating rules from databases themselves autonomously, in order to detect data inconsistency problems from large databases. The proposed approach employs confidence and lift measures with integrity constraints to guarantee that generated rules are minimal, non-redundant and precise. Since healthcare applications are critical, and managing healthcare environments efficiently results in patient care improvement. The proposed approach is validated against several datasets from healthcare environment. It provides clinicians with automated approach for enhancing quality of electronic medical records. We experimentally demonstrate that the proposed approach achieves significant enhancement over existing approaches.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.