Strings are widely used in modern programming languages in various scenarios. For instance, strings are used to build up Structured Query Language (SQL) queries that are then executed. Malformed strings may lead to subtle bugs, as well as non-sanitized strings may raise security issues in an application. For these reasons, the application of static analysis to compute safety properties over string values at compile time is particularly appealing. In this article, we propose a generic approach for the static analysis of string values based on abstract interpretation. In particular, we design a suite of abstract semantics for strings, where each abstract domain tracks a different kind of information. We discuss the trade-off between efficiency and accuracy when using such domains to catch the properties of interest. In this way, the analysis can be tuned at different levels of precision and efficiency, and it can address specific properties. The concrete semantics of these operators is approximated in different ways by the five different abstract domains. In addition, after 30 years of practice with numerical domains, it is clear that a monolithic domain precise on any program and property (e.g., Polyhedra [14]) gives up in terms of efficiency, whereas to achieve scalability, we need specific approximations on a given property (e.g., Pentagons [17]) or class of programs (e.g., ASTRÉE [18]). With this scenario in mind, we develop several domains inside the same framework to tune the analysis at different levels of precision and efficiency with respect to the analyzed class of programs and property. Other abstractions are possible and welcomed, and we expect our framework to be generic enough to support them.This article ‡ is structured as follows. In the rest of this section, we introduce two running examples, and we recall some basic concepts of abstract interpretation. Section 2 defines the syntax of the string operators that we will consider in the rest of the article. Section 3 introduces their concrete semantics, whereas in Section 4, the five abstract domains and the core of this work are formalized and are used to analyze the running examples. In Section 5, more experimental results are presented. Finally, Section 6 discusses the related work, and Section 7 concludes. Running examplesThroughout all the article, we will always refer to the two examples reported in Figures 1(a) and 1(b).The first Java program, prog1, is taken from [7], and it dynamically builds an SQL query by concatenating several strings. One of these concatenations applies only if a given input value, unknown at compile time, is not null. We are interested in checking if the SQL query resulting by ‡ The article is a fully revised and extended version of [19]
Abstract. In this paper we propose a unifying approach for the static analysis of string values based on abstract interpretation, and we present several abstract domains that track different types of information. In this way, the analysis can be tuned at different levels of precision and efficiency, and it can address specific properties.
Abstract. Effective static analyses must precisely approximate both heap structure and information about values. During the last decade, shape analysis has obtained great achievements in the field of heap abstraction. Similarly, numerical and other value abstractions have made tremendous progress, and they are effectively applied to the analysis of industrial software. In addition, several generic static analyzers have been introduced. These compositional analyzers combine many types of abstraction into the same analysis to prove various properties. The main contribution of this paper is the combination of Sample, an existing generic analyzer, with a TVLA-based heap abstraction (TVAL+).
We introduce an enhanced information-flow analysis for tracking the amount of confidential data that is possibly released to third parties by a mobile application. The main novelty of our solution is that it can explicitly keep track of the footprint of data sources in the expressions formed and manipulated by the program, as well as of transformations over them, yielding a lazy approach with finer granularity, which may reduce false positives with respect to state-of-the-art information-flow analyses
The aim of this paper is to provide a general overview of the product operators introduced in the literature as a tool to enhance the analysis accuracy in the Abstract Interpretation framework. In particular we focus on the Cartesian and reduced products, as well as on the reduced cardinal power, an under-used technique whose features deserve to be stressed for their potential impact in practical applications
This paper addresses the problem of detecting JavaScript security vulnerabilities in the client side of Web applications. Such vulnerabilities are becoming a source of growing concern due to the rapid migration of server-side business logic to the client side, combined with new JavaScriptbacked Web technologies, such as AJAX and HTML5. Detection of client-side vulnerabilities is challenging given the dynamic and event-driven nature of JavaScript. We present a hybrid form of JavaScript analysis, which augments static analysis with (semi-)concrete information by applying partial evaluation to JavaScript functions according to dynamic data recorded by the Web crawler. The dynamic component rewrites the program per the enclosing HTML environment, and the static component then explores all possible behaviors of the partially evaluated program (while treating usercontrolled aspects of the environment conservatively).We have implemented this hybrid architecture as the JSA analysis tool, which is integrated into the IBM AppScan Standard Edition product. We formalize the static analysis and prove useful properties over it. We also tested the system across a set of 170,000 Web pages, comparing it with purely static and dynamic alternatives. The results provide conclusive evidence in favor of our hybrid approach. Only 10% of the reports by JSA are false alarms compared to 63% of the alarms flagged by its purely static counterpart, while not a single true warning is lost. This represents a reduction of 94% in false alarms. Compared with a commercial testing algorithm, JSA detects vulnerabilities in > 4x more Web sites with only 4 false alarms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.