Abstract. Preprocessing is often the most time-consuming phase in data analysis and interdependent data quality issues a cause of suboptimal modelling results. The design problem addressed in this paper is: what kind of framework can support visualization of data quality issue interdependencies for faster and more effective preprocessing? An object framework was designed that uses constructed features as a basis of visualizations. Six real datasets from business performance measurement system domain were acquired to demonstrate the implementation. The framework was found to be a viable preprocessing analysis supplement to both industry practice of exploratory data analysis and research benchmark of preprocessing combinations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.