Xujie Si scite author profile

The problem of learning logical rules from examples arises in diverse fields, including program synthesis, logic programming, and machine learning. Existing approaches either involve solving computationally difficult combinatorial problems, or performing parameter estimation in complex statistical models. In this paper, we present DIFFLOG, a technique to extend the logic programming language Datalog to the continuous setting. By attaching real-valued weights to individual rules of a Datalog program, we naturally associate numerical values with individual conclusions of the program. Analogous to the strategy of numerical relaxation in optimization problems, we can now first determine the rule weights which cause the best agreement between the training labels and the induced values of output tuples, and subsequently recover the classical discrete-valued target program from the continuous optimum. We evaluate DIFFLOG on a suite of 34 benchmark problems from recent literature in knowledge discovery, formal verification, and database query-byexample, and demonstrate significant improvements in learning complex programs with recursive rules, invented predicates, and relations of arbitrary arity.

show abstract

Syntax-guided synthesis of Datalog programs

Lee

Zhang

et al. 2018

View full text Add to dashboard Cite

Effective interactive resolution of static analysis alarms

Zhang

Grigore

et al. 2017

Proc. ACM Program. Lang.

View full text Add to dashboard Cite

We propose an interactive approach to resolve static analysis alarms. Our approach synergistically combines a sound but imprecise analysis with precise but unsound heuristics, through user interaction. In each iteration, it solves an optimization problem to find a set of questions for the user such that the expected payoff is maximized. We have implemented our approach in a tool, Ursa, that enables interactive alarm resolution for any analysis specified in the declarative logic programming language Datalog. We demonstrate the effectiveness of Ursa on a state-of-the-art static datarace analysis using a suite of 8 Java programs comprising 41-194 KLOC each. Ursa is able to eliminate 74% of the false alarms per benchmark with an average payoff of 12× per question. Moreover, Ursa prioritizes user effort effectively by posing questions that yield high payoffs earlier.

show abstract

Code2Inv: A Deep Learning Framework for Program Verification

Naik

Dai

et al. 2020

View full text Add to dashboard Cite

We propose a general end-to-end deep learning framework Code2Inv, which takes a verification task and a proof checker as input, and automatically learns a valid proof for the verification task by interacting with the given checker. Code2Inv is parameterized with an embedding module and a grammar: the former encodes the verification task into numeric vectors while the latter describes the format of solutions Code2Inv should produce. We demonstrate the flexibility of Code2Inv by means of two small-scale yet expressive instances: a loop invariant synthesizer for C programs, and a Constrained Horn Clause (CHC) solver.

show abstract

Automated black-box detection of access control vulnerabilities in web applications

2014

View full text Add to dashboard Cite

Access control vulnerabilities within web applications pose serious security threats to the sensitive information stored at back-end databases. Existing approaches are limited from several aspects, including the coarse granularity at which the access control is modeled, the incapability of handling complex relationship between data entities and the requirement of source code and the specific application platform. In this paper, we present an automated black-box technique for identifying a broad range of access control vulnerabilities, which can be applied to applications that are developed using different languages and platforms. We model the access control policy based on a novel virtual SQL query concept, which captures both the database access operations (i.e., through SQL queries) and the post-processing filters within the web application. We leverage a crawler to automatically explore the application and collect execution traces. From the traces, we identify the set of database access operations that are allowed for each role (i.e., role-level policy inference) and extract the constraints over the operation parameters to characterize the relationship between the users and the accessed data (i.e., user-level policy inference). Based on the inferred policy, we construct test inputs to exploit the application for potential access control flaws. We implement a prototype system BATMAN and evaluate it over a set of PHP and JSP web applications. The experiment results demonstrate the effectiveness and accuracy of our approach.

show abstract

On Incremental Core-Guided MaxSAT Solving

Zhang

Manquinho

et al. 2016

View full text Add to dashboard Cite

Continuously reasoning about programs using differential Bayesian inference

Heo

Raghothaman

et al. 2019

View full text Add to dashboard Cite

Programs often evolve by continuously integrating changes from multiple programmers. The effective adoption of program analysis tools in this continuous integration setting is hindered by the need to only report alarms relevant to a particular program change. We present a probabilistic framework, Drake, to apply program analyses to continuously evolving programs. Drake is applicable to a broad range of analyses that are based on deductive reasoning. The key insight underlying Drake is to compute a graph that concisely and precisely captures differences between the derivations of alarms produced by the given analysis on the program before and after the change. Performing Bayesian inference on the graph thereby enables to rank alarms by likelihood of relevance to the change. We evaluate Drake using SparrowÐa static analyzer that targets buffer-overrun, format-string, and integer-overflow errorsÐon a suite of ten widely-used C programs each comprising 13kś112k lines of code. Drake enables to discover all true bugs by inspecting only 30 alarms per benchmark on average, compared to 85 (3× more) alarms by the same ranking approach in batch mode, and 118 (4× more) alarms by a differential approach based on syntactic masking of alarms which also misses 4 of the 26 bugs overall. CCS Concepts • Software and its engineering → Automated static analysis; Software evolution; • Mathematics of computing → Bayesian networks.

show abstract

Combining the logical and the probabilistic in program analysis

Zhang

Naik

2017

View full text Add to dashboard Cite

Conventional program analyses have made great strides by leveraging logical reasoning. However, they cannot handle uncertain knowledge, and they lack the ability to learn and adapt. This in turn hinders the accuracy, scalability, and usability of program analysis tools in practice. We seek to address these limitations by proposing a methodology and framework for incorporating probabilistic reasoning directly into existing program analyses that are based on logical reasoning. We demonstrate that the combined approach can benefit a number of important applications of program analysis and thereby facilitate more widespread adoption of this technology.

show abstract

12 3

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Xujie Si

Synthesizing Datalog Programs using Numerical Relaxation

Syntax-guided synthesis of Datalog programs

Effective interactive resolution of static analysis alarms

Code2Inv: A Deep Learning Framework for Program Verification

Automated black-box detection of access control vulnerabilities in web applications

On Incremental Core-Guided MaxSAT Solving

Continuously reasoning about programs using differential Bayesian inference

Combining the logical and the probabilistic in program analysis

Contact Info

Product

Resources

About