This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines semantic parsing techniques from the NLP community with type-directed program synthesis and automated program repair. Starting with a program sketch obtained using standard parsing techniques, our approach involves an iterative refinement loop that alternates between quantitative type inhabitation and automated sketch repair. We use the proposed idea to build an end-to-end system called Sqlizer that can synthesize SQL queries from natural language. Our method is fully automated, works for any database without requiring additional customization, and does not require users to know the underlying database schema. We evaluate our approach on over 450 natural language queries concerning three different databases, namely MAS, IMDB, and YELP. Our experiments show that the desired query is ranked within the top 5 candidates in close to 90% of the cases and that Sqlizer outperforms Nalir, a state-of-the-art tool that won a best paper award at VLDB'14. CCS Concepts: • Human-centered computing → Natural language interfaces; • Software and its engineering → Domain specific languages; • Information systems → Relational database query languages; Database utilities and tools; Extraction, transformation and loading;
Abstract. We describe a symbolic heap abstraction that unifies reasoning about arrays, pointers, and scalars, and we define a fluid update operation on this symbolic heap that relaxes the dichotomy between strong and weak updates. Our technique is fully automatic, does not suffer from the kind of state-space explosion problem partition-based approaches are prone to, and can naturally express properties that hold for non-contiguous array elements. We demonstrate the effectiveness of this technique by evaluating it on challenging array benchmarks and by automatically verifying buffer accesses and dereferences in five Unix Coreutils applications with no annotations or false alarms.
We present a new, precise technique for fully path-and contextsensitive program analysis. Our technique exploits two observations: First, using quantified, recursive formulas, path-and contextsensitive conditions for many program properties can be expressed exactly. To compute a closed form solution to such recursive constraints, we differentiate between observable and unobservable variables, the latter of which are existentially quantified in our approach. Using the insight that unobservable variables can be eliminated outside a certain scope, our technique computes satisfiabilityand validity-preserving closed-form solutions to the original recursive constraints. We prove the solution is as precise as the original system for answering may and must queries as well as being small in practice, allowing our technique to scale to the entire Linux kernel, a program with over 6 million lines of code.
A minimum satisfying assignment of a formula is a minimumcost partial assignment of values to the variables in the formula that guarantees the formula is true. Minimum satisfying assignments have applications in software and hardware verification, electronic design automation, and diagnostic and abductive reasoning. While the problem of computing minimum satisfying assignments has been widely studied in propositional logic, there has been no work on computing minimum satisfying assignments for richer theories. We present the first algorithm for computing minimum satisfying assignments for satisfiability modulo theories. Our algorithm can be used to compute minimum satisfying assignments in theories that admit quantifier elimination, such as linear arithmetic over reals and integers, bitvectors, and difference logic. Since these richer theories are commonly used in software verification, we believe our algorithm can be gainfully used in many verification approaches.
We propose a novel, sound, and complete Simplex-based algorithm for solving linear inequalities over integers. Our algorithm, which can be viewed as a semantic generalization of the branch-and-bound technique, systematically discovers and excludes entire subspaces of the solution space containing no integer points. Our main insight is that by focusing on the defining constraints of a vertex, we can compute a proof of unsatisfiability for the intersection of the defining constraints and use this proof to systematically exclude subspaces of the feasible region with no integer points. We show experimentally that our technique significantly outperforms the top four competitors in the QF-LIA category of the SMT-COMP '08 when solving linear inequalities over integers.
We present an overview of the Saturn program analysis system, including a rationale for three major design decisions: the use of function-at-a-time, or summary-based, analysis, the use of constraints, and the use of a logic programming language to express program analysis algorithms. We argue that the combination of summaries and constraints allows Saturn to achieve both great scalability and great precision, while the use of a logic programming language with constraints allows for succinct, high-level expression of program analyses.
This paper presents a new method for generating inductive loop invariants that are expressible as boolean combinations of linear integer constraints. The key idea underlying our technique is to perform a backtracking search that combines Hoare-style verification condition generation with a logical abduction procedure based on quantifier elimination to speculate candidate invariants. Starting with true, our method iteratively strengthens loop invariants until they are inductive and strong enough to verify the program. A key feature of our technique is that it is lazy: It only infers those invariants that are necessary for verifying program correctness. Furthermore, our technique can infer arbitrary boolean combinations (including disjunctions) of linear invariants. We have implemented the proposed approach in a tool called HOLA. Our experiments demonstrate that HOLA can infer interest-
Containers are general-purpose data structures that provide functionality for inserting, reading, removing, and iterating over elements. Since many applications written in modern programming languages, such as C++ and Java, use containers as standard building blocks, precise analysis of many programs requires a fairly sophisticated understanding of container contents. In this paper, we present a sound, precise, and fully automatic technique for static reasoning about contents of containers. We show that the proposed technique adds useful precision for verifying real C++ applications and that it scales to applications with over 100,000 lines of code.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.