When transforming data, one often wants certain information in the data source to be preserved, i.e., we identify parts of the source data and require these parts to be transformed without loss of information. We characterize the preservation of selected information in terms of the notions of invertibility and query preservation, in a setting where transformations are specified as a view V (a set of queries), and source information is selected by a query Q. We investigate the problem of determining whether transformations V preserve the information selected by Q. (1) We show that the notion of invertibility coincides with view determinacy studied for query rewriting. (2) We establish the undecidability of the problem when either Q or V is in DATALOG or first-order logic, for both invertibility and query preservation. (3) When Q and V are conjunctive queries (CQ), the problem is as hard as view determinacy for CQ queries and CQ views, an open problem. Nevertheless, we provide complexity bounds for the problem, either PTIME or NP-complete, when V ranges over subclasses of CQ (i.e., SP, SC, PC), and when Q is or is not assumed to be a minimal CQ query. (4) We show that CQ is complete for L-to-CQ rewriting when L is SP, SC or PC, i.e., every CQ query can be rewritten in terms of SP, SC or PC views using a query in CQ.
The problem of answering queries using views arises in a wide variety of data management applications. From the information-theoretic perspective, a notion of determinacy has recently been introduced to formalize the intuition that a set of views V is sufficient to answer a query Q. We say that V determines Q iff for any two database instances D1 and D2, V(D1) = V(D2) implies Q(D1) = Q(D2). Determinacy has been investigated for many query and view languages, including first-order logic (FO) and unions of conjunctive queries (UCQ), and a considerable number of cases have been resolved. However, the problem remains open for queries and views defined by conjunctive queries (CQ) and appears to be quite challenging. In this paper we study the problem of determinacy for conjunctive queries and views over unary database schemas, where each relation has only one attribute. We show that determinacy is decidable in PTIME in this case. We provide syntactic characterizations for a CQ query Q to be determined by a set of CQ views V, and give an algorithm for checking determinacy that runs in time O(|Q| * |V|), where |Q| and |V| are the sizes of Q and V respectively. Furthermore, we show that whenever V determines Q there exists a CQ query that is an equivalent rewriting of Q using V.
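The definition of determinacy can be made concrete with a small brute-force search. The sketch below is only an illustration, not the linear-time algorithm from the abstract: it enumerates all instances of a unary schema over a tiny fixed domain and looks for two instances that agree on every view but disagree on the query. The relation names `R` and `S`, the helper names, and the example views and queries are all hypothetical. Finding a pair refutes determinacy; finding none over a bounded domain is evidence only, not a proof.

```python
from itertools import combinations, product


def subsets(domain):
    """All subsets of a finite domain, as frozensets."""
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def instances(relations, domain):
    """Every instance of a unary schema: one subset of the domain per relation."""
    for choice in product(subsets(domain), repeat=len(relations)):
        yield dict(zip(relations, choice))


def find_counterexample(views, query, relations, domain):
    """Return (D1, D2) with equal view images but different query answers, or None."""
    seen = {}  # view image -> (first instance seen, its query answer)
    for db in instances(relations, domain):
        image = tuple(view(db) for view in views)
        answer = query(db)
        if image in seen and seen[image][1] != answer:
            return seen[image][0], db
        seen.setdefault(image, (db, answer))
    return None


# Hypothetical example: a single view V(x) :- R(x), S(x).
def v_join(db):
    return db["R"] & db["S"]


# Q1(x) :- R(x), S(x) is the view itself; Q2(x) :- R(x) needs more than the view.
def q_join(db):
    return db["R"] & db["S"]


def q_r(db):
    return db["R"]


same = find_counterexample([v_join], q_join, ["R", "S"], {0, 1})
diff = find_counterexample([v_join], q_r, ["R", "S"], {0, 1})
print(same)  # None: no refutation found on this domain
print(diff)  # a pair of instances that the view cannot tell apart
```

In this example the view hides which elements of `R` fall outside `S`, so two instances with the same intersection can still differ on `R`.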
Producing sentences from a grammar, according to various criteria, is required in many applications. It is also a basic building block for grammar engineering. This paper presents a toolkit for context-free grammars, which mainly consists of several algorithms for sentence generation or enumeration and for coverage analysis. The toolkit deals with general context-free grammars. Besides providing implementations of the algorithms, the toolkit also provides a simple graphical user interface through which the user can use the toolkit directly. The toolkit is implemented in Java and is available at http://lcs.ios.ac.cn/~zhiwu/toolkit.php. The paper gives an overview of the toolkit and a description of the GUI, and reports experimental results and preliminary applications of the toolkit.
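As a rough illustration of what sentence enumeration involves, the following minimal sketch lists every sentence of a context-free grammar up to a length bound by breadth-first search over leftmost derivations. It is not the toolkit's implementation (which is in Java and not described here); the function name `enumerate_sentences`, the dictionary encoding of the grammar, and the example grammar are assumptions made for this example. The length-based pruning is only safe under the assumption that the grammar has no empty productions, so sentential forms never shrink.

```python
from collections import deque


def enumerate_sentences(grammar, start, max_len):
    """List all sentences of at most max_len terminals.

    grammar maps each nonterminal to a list of right-hand sides (lists of symbols).
    Assumes no empty right-hand sides, so pruning long forms loses nothing.
    """
    sentences = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            sentences.add(" ".join(form))  # only terminals left: a sentence
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    # Shortest sentences first, ties broken alphabetically.
    return sorted(sentences, key=lambda s: (len(s.split()), s))


# Hypothetical toy grammar: E -> E + T | T,  T -> a | ( E )
toy_grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["a"], ["(", "E", ")"]],
}

result = enumerate_sentences(toy_grammar, "E", 3)
print(result)  # ['a', '( a )', 'a + a']
```

The `seen` set keeps the search finite even for left-recursive rules or unit cycles, since only finitely many sentential forms fit under the length bound.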
Grammars, especially context-free grammars, are widely used within and even outside the field of computer science. In this paper, we present a systematic framework for grammar testing, in which some commonly used techniques for testing programs such as module testing and integration testing are adapted and applied to the testing of grammars. We propose a nonterminal-based approach for grammar modularization, combined with an iterative process for grammar testing in which a grammar is tested with respect to both a generator and a recognizer. Experiments on grammars for some non-trivial programming languages such as C and Java demonstrate the feasibility and efficiency of the testing framework and the proposed approaches.
Regular expressions are widely used within and even outside of computer science due to their expressiveness and flexibility. However, regular expressions have a quite compact and rather tolerant syntax that makes them hard to understand, hard to compose, and error-prone. Faulty regular expressions may cause failures of the applications that use them. Therefore, ensuring the correctness of regular expressions is a vital prerequisite for their use in practical applications. The importance and necessity of ensuring correct definitions of regular expressions have attracted extensive attention from researchers and practitioners, especially in recent years. In this study, we provide a review of recent work on ensuring the correct usage of regular expressions. We classify this work into different categories, including empirical studies, test string generation, automatic synthesis and learning, static checking and verification, visual representation and explanation, and repair. For each category, we review the main results, compare different approaches, and discuss their advantages and disadvantages. We also discuss some potential future research directions.