Timotej Kapus scite author profile

Cadar

2017

Symbolic execution has attracted significant attention in recent years, with applications in software testing, security, networking and more. Symbolic execution tools, like CREST, KLEE, FuzzBALL, and Symbolic PathFinder, have enabled researchers and practitioners to experiment with new ideas, scale the technique to larger applications and apply it to new application domains. Therefore, the correctness of these tools is of critical importance. In this paper, we present our experience extending compiler testing techniques to find errors in both the concrete and symbolic execution components of symbolic execution engines. The approach used relies on a novel way to create program versions, in three different testing modes (concrete, single-path and multi-path), each exercising different features of symbolic execution engines. When combined with existing program generation techniques and appropriate oracles, this approach enables differential testing within a single symbolic execution engine. We have applied our approach to the KLEE, CREST and FuzzBALL symbolic execution engines, where it has discovered 20 different bugs exposing a variety of important errors having to do with the handling of structures, division, modulo, casting, vector instructions and more, as well as issues related to constraint solving, compiler optimisations and test input replay.

A segmented memory model for symbolic execution

Cadar

2019

Symbolic execution is an effective technique for exploring paths in a program and reasoning about all possible values on those paths. However, the technique still struggles with code that uses complex heap data structures, in which a pointer is allowed to refer to more than one memory object. In such cases, symbolic execution typically forks execution into multiple states, one for each object to which the pointer could refer. In this paper, we propose a technique that avoids this expensive forking by using a segmented memory model. In this model, memory is split into segments, so that each symbolic pointer refers to objects in a single segment. The size of each segment is bounded by a threshold, in order to avoid expensive constraints. This results in a memory model where forking due to symbolic pointer dereferences is significantly reduced, and often eliminated completely. We evaluate our segmented memory model on a mix of whole program benchmarks (such as m4 and make) and library benchmarks (such as SQLite), and observe significant decreases in execution time and memory usage.
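
As a rough illustration of the situation the abstract describes, the C sketch below shows a pointer that can refer to more than one memory object depending on an input. It is not taken from the paper: the arrays `first` and `second` and the function `read_through_pointer` are hypothetical names chosen for this example. If `which` were a symbolic value, an engine that forks per object would need one state for each array that `p` could point to.

```c
#include <stddef.h>

/* Two separate memory objects (hypothetical example data). */
static int first[4]  = {10, 11, 12, 13};
static int second[4] = {20, 21, 22, 23};

/* The object that `p` refers to depends on `which`.
   With a symbolic `which`, the dereference below could touch either
   array, which is the case where per-object forking would occur. */
int read_through_pointer(unsigned which, size_t offset) {
    int *objects[2] = { first, second };
    int *p = objects[which % 2];   /* p refers to one of two objects */
    return p[offset % 4];          /* offset is kept in bounds */
}
```

With concrete inputs the function simply reads one element; the point of the sketch is only that the target object of `p` is not known until `which` is fixed.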

Computing summaries of string loops in C for better testing and refactoring

Ish-Shalom

Itzhaky

et al. 2019

Analysing and comprehending C programs that use strings is hard: using standard library functions for manipulating strings is not enforced and programs often use complex loops for the same purpose. We introduce the notion of memoryless loops that capture some of these string loops and present a counterexample-guided inductive synthesis approach to summarise memoryless string loops using C standard library functions, which has applications to testing, optimization and refactoring. We prove our summarization is correct for arbitrary input strings and evaluate it on a database of loops we gathered from a set of 13 open-source programs. Our approach can summarize over two thirds of memoryless loops in less than 5 minutes of computation time per loop. We then show that these summaries can be used to (1) enhance symbolic execution testing, where we observed median speedups of 79x when employing a string constraint solver, (2) optimize native code, where certain summarizations led to significant performance gains, and (3) refactor code, where we had several patches accepted in the codebases of popular applications such as patch and wget.
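
To make the idea of summarising a string loop with standard library functions more concrete, here is a small C sketch. It is an illustrative assumption, not an example from the paper and not its synthesis procedure: the function names are hypothetical, and the loop was picked because each iteration inspects only the current character. The hand-written loop skips leading blanks, and the candidate summary expresses the same behaviour with `strspn`.

```c
#include <stddef.h>
#include <string.h>

/* Hand-written loop: skip leading spaces and tabs.
   Each iteration looks only at the current character. */
const char *skip_blanks_loop(const char *s) {
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Candidate summary: strspn returns the length of the initial
   segment of s made up only of the characters in " \t". */
const char *skip_blanks_summary(const char *s) {
    return s + strspn(s, " \t");
}

/* Returns 1 when both versions stop at the same position in s. */
int summaries_agree(const char *s) {
    return skip_blanks_loop(s) == skip_blanks_summary(s);
}
```

Checking agreement on a few inputs, as `summaries_agree` does, is only a spot check; the abstract states that the paper proves its summaries correct for arbitrary input strings.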

FAUSTA: Scaling Dynamic Analysis with Traffic Generation at WhatsApp

Mao

Kapus

Petrou

et al. 2022

Constraints in Dynamic Symbolic Execution: Bitvectors or Integers?

Nowack

Cadar

2019

Dynamic symbolic execution is a technique that analyses programs by gathering mathematical constraints along execution paths. To achieve bit-level precision, one must use the theory of bitvectors. However, other theories might achieve higher performance, justifying in some cases the possible loss of precision. In this paper, we explore the impact of using the theory of integers on the precision and performance of dynamic symbolic execution of C programs. In particular, we compare an implementation of the symbolic executor KLEE using a partial solver based on the theory of integers, with a standard implementation of KLEE using a solver based on the theory of bitvectors, both employing the popular SMT solver Z3. To our surprise, our evaluation on a synthetic sort benchmark, the ECA set of Test-Comp 2019 benchmarks, and GNU Coreutils revealed that for most applications the integer solver did not lead to any loss of precision, but the overall performance difference was rarely significant.
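
One familiar place where the two theories can disagree is arithmetic wraparound. The C sketch below is an illustration under that assumption and is not drawn from the paper's evaluation; the function names are hypothetical. It contrasts a 32-bit view, where `x + 1 < x` holds for one input, with an integer view emulated by a wider type, where it never holds.

```c
#include <stdint.h>

/* Bitvector view: 32-bit unsigned arithmetic wraps modulo 2^32,
   so x + 1 < x holds exactly when x == UINT32_MAX. */
int overflow_branch_bitvector(uint32_t x) {
    uint32_t next = x + 1u;   /* wraps to 0 when x is UINT32_MAX */
    return next < x;
}

/* Integer view, emulated with a wider type so the sum cannot wrap:
   x + 1 < x never holds for mathematical integers. */
int overflow_branch_integer(uint32_t x) {
    int64_t next = (int64_t)x + 1;
    return next < (int64_t)x;
}
```

A solver reasoning over integers would treat the branch guarded by `x + 1 < x` as infeasible, while a bitvector solver would find the input `UINT32_MAX`; whether such cases matter in practice is the kind of question the abstract says the evaluation addresses.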