There is growing interest in allowing users to ask questions about the results they receive, in the hope of improving the usability of database systems. This research aims at answering the so-called why and why-not questions on received results w.r.t. different query settings in databases. The main goals of this research are: (i) studying the problem of answering why and why-not questions in databases; (ii) finding efficient strategies for answering these questions under different query settings; and (iii) developing a framework that can take advantage of existing data indexing and query evaluation techniques to answer such questions in databases. We believe that this research can contribute towards improving the usability of traditional database systems.
I. INTRODUCTION

After decades of effort by the database community, today's systems have become highly efficient in terms of both query execution time and resource usage. However, these systems are not as usable for end users as they are proficient in the underlying data management and query evaluation [16]. Users now expect systems to be more interactive and cooperative. That is, users are not satisfied with merely receiving a result from the system; they also want to know why the system returns the current set of objects, especially when the result set does not match their expectations. In particular, users may want to know why a certain unexpected data object appears in the result set and, similarly, why a certain expected data object does not appear in it. As a next step, users may also seek appropriate explanations for these questions. A system that can provide good explanations for such questions helps users understand their information needs and makes the system more interactive and transparent to them [27], [10], [13]. At present, traditional database systems do not provide any exploratory data analysis facilities to support these why and why-not questions.

There are different aspects of answering why and why-not questions in databases. These are: (i) computing provenance (also called "lineage" or "pedigree") information to describe the origins of the output data and the process by which it was derived [4], [6]; (ii) modifying the original database so that missing objects become part of the query output [12], [11]; (iii) identifying the causes (e.g., the operator(s) in the user-submitted query) that filter out the desired query output [5]; and (iv) modifying the query so that missing objects appear in