A number of effective error detection tools have been built in recent years to check if a program conforms to certain design rules. An important class of design rules deals with sequences of events asso-ciated with a set of related objects. This paper presents a language called PQL (Program Query Language) that allows programmers to express such questions easily in an application-specific context. A query looks like a code excerpt corresponding to the shortest amount of code that would violate a design rule. Details of the tar-get application's precise implementation are abstracted away. The programmer may also specify actions to perform when a match is found, such as recording relevant information or even correcting an erroneous execution on the fly.We have developed both static and dynamic techniques to find solutions to PQL queries. Our static analyzer finds all potential matches conservatively using a context-sensitive, flow-insensitive, inclusion-based pointer alias analysis. Static results are also use-ful in reducing the number of instrumentation points for dynamic analysis. Our dynamic analyzer instruments the source program to catch all violations precisely as the program runs and to optionally perform user-specified actions.We have implemented the techniques described in this paper and found 206 errors in 6 large real-world open-source Java applica-tions containing a total of nearly 60,000 classes. These errors are important security flaws, resource leaks, and violations of consis-tency invariants. The combination of static and dynamic analysis proves effective at addressing a wide range of debugging and pro-gram comprehension queries. We have found that dynamic analysis is especially suitable for preventing errors such as security vulner-abilities at runtime.
Modern programming languages, ranging from Haskell and ML, to JavaScript, C# and Java, all make extensive use of higher-order state. This paper advocates a new verification methodology for higher-order stateful programs, based on a new monad of predicate transformers called the Dijkstra monad.Using the Dijkstra monad has a number of benefits. First, the monad naturally yields a weakest pre-condition calculus. Second, the computed specifications are structurally simpler in several ways, e.g., single-state post-conditions are sufficient (rather than the more complex two-state post-conditions). Finally, the monad can easily be varied to handle features like exceptions and heap invariants, while retaining the same type inference algorithm.We implement the Dijkstra monad and its type inference algorithm for the F programming language. Our most extensive case study evaluates the Dijkstra monad and its F implementation by using it to verify JavaScript programs.Specifically, we describe a tool chain that translates programs in a subset of JavaScript decorated with assertions and loop invariants to F . Once in F , our type inference algorithm computes verification conditions and automatically discharges their proofs using an SMT solver. We use our tools to prove that a core model of the JavaScript runtime in F respects various invariants and that a suite of JavaScript source programs are free of runtime errors.
Many tools allow programmers to develop applications in highlevel languages and deploy them in web browsers via compilation to JavaScript. While practical and widely used, these compilers are ad hoc: no guarantee is provided on their correctness for whole programs, nor their security for programs executed within arbitrary JavaScript contexts. This paper presents a compiler with such guarantees. We compile an ML-like language with higher-order functions and references to JavaScript, while preserving all source program properties. Relying on type-based invariants and applicative bisimilarity, we show full abstraction: two programs are equivalent in all source contexts if and only if their wrapped translations are equivalent in all JavaScript contexts. We evaluate our compiler on sample programs, including a series of secure libraries.This version supercedes the version in the official proceedings of POPL '13. In particular, we fix upfun in Figure 4, rectifying an experimental error that rendered the previous upfun an insufficient protection on several popular browsers. The new version is confirmed to work on IE 9, IE 10, Chrome 23, and Firefox 16. Thanks to Karthik Bhargavan for pointing out the error.
A great deal of research on sanitizer placement, sanitizer correctness, checking path validity, and policy inference, has been done in the last five to ten years, involving type systems, static analysis and runtime monitoring and enforcement. However, in pretty much all work thus far, the burden on sanitizer placement has fallen on the developer. However, sanitizer placement in large-scale applications is difficult, and developers are likely to make errors, and thus create security vulnerabilities. This paper advocates a radically different approach: we aim to fully automate the placement of sanitizers by analyzing the flow of tainted data in the program. We argue that developers are better off leaving out sanitizers entirely instead of trying to place them. This paper proposes a fully automatic technique for sanitizer placement. Placement is static whenever possible, switching to run time when necessary. Run-time taint tracking techniques can be used to track the source of a value, and thus apply appropriate sanitization. However, due to the runtime overhead of run-time taint tracking, our technique avoids it wherever possible.
String-manipulating programs are an important class of programs with applications in malware detection, graphics, input sanitization for Web security, and large-scale HTML processing. This paper extends prior work on BEK, an expressive domain-specific language for writing string-manipulating programs, with algorithmic insights that make BEK both analyzable and data-parallel. By analyzable we mean that unlike most general purpose programming languages, many algebraic properties of a BEK program are decidable (i.e., one can check whether two programs commute or compute the inverse of a program). By data-parallel we mean that a BEK program can compute on arbitrary subsections of its input in parallel, thus exploiting parallel hardware. This latter requirement is particularly important for programs which operate on large data: without data parallelism, a programmer cannot hide the latency of reading data from various storage media (i.e., reading a terabyte of data from a modern hard drive takes about 3 hours). With a data-parallel approach, the system can split data across multiple disks and thus hide the latency of reading the data.A BEK program is expressive: a programmer can use conditionals, switch statements, and registers-or local variables-in order to implement common string-manipulating programs. Unfortunately, this expressivity induces data dependencies, which are an obstacle to parallelism. The key contribution of this paper is an algorithm which automatically removes these data dependencies by mapping a BEK program into a intermediate format consisting of symbolic transducers, which extend classical transducers with symbolic predicates and symbolic assignments. We present a novel algorithm that we call exploration which performs symbolic loop unrolling of these transducers to obtain simplified versions of the original program. We show how these simplified versions can then be lifted to a stateless form, and from there compiled to data-parallel hardware.To evaluate the efficacy of our approach, we demonstrate up to 8x speedups for a number of real-world, BEK programs, (e.g., HTML encoder and decoder) on data-parallel hardware. To the best of our knowledge, these are the first data parallel implementation of these programs. To validate that our approach is correct, we use an automatic testing technique to compare our generated code to the original implementations and find no semantic deviations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.