2016 IEEE Cybersecurity Development (SecDev)
DOI: 10.1109/secdev.2016.018

Static Analysis Alert Audits: Lexicon & Rules


Cited by 4 publications (3 citation statements)
References 11 publications
“…One example of insufficient labeled data as a barrier comes from our previous work with 3 large organizations that do software development, where lack of data covering more types of flaws resulted in classifier incorporation being impractical for them [2]. Even when an organization has large audit archives, if the auditors have not used a consistent set of audit rules and a well-defined auditing lexicon, the data may not be useful (and most organizations don't have a well-defined auditing lexicon and auditing rules) [28]. Data-rich Google developed 85% accurate classifier models predicting FindBugs false positives [27].…”
Section: A. Related Work
Confidence: 99%
“…However, that work was only able to develop accurate classifiers for 3 CERT C coding rules with single rule data, despite using a significant quantity of audit archives. Those audit archives include data from 8 years of CERT analysis on 26 codebases, plus new audit data provided by three collaborating organizations over the course of a year where the collaborators audited SA alerts for their own codebases using an auditing lexicon and auditing rules we developed [28]. Our current work addresses that labeled-data quantity problem.…”
Section: A. Related Work
Confidence: 99%
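The classifier approach the citing work describes — learning from archives of audited alert determinations to predict which new alerts are false positives — can be sketched minimally as a per-checker prior: predict "false positive" when a checker's historical false-positive rate in the audit archive exceeds a threshold. This is an illustrative baseline, not the classifiers the cited papers built; the checker IDs and archive data below are hypothetical examples.

```python
from collections import defaultdict

def train_checker_priors(audited_alerts):
    """Learn, per checker ID, the fraction of audited alerts
    that auditors determined to be false positives."""
    counts = defaultdict(lambda: [0, 0])  # checker -> [fp_count, total]
    for checker, is_false_positive in audited_alerts:
        counts[checker][1] += 1
        if is_false_positive:
            counts[checker][0] += 1
    return {c: fp / total for c, (fp, total) in counts.items()}

def predict(priors, checker, threshold=0.5):
    """Predict True (likely false positive) when the checker's historical
    FP rate exceeds the threshold; unseen checkers default to False."""
    return priors.get(checker, 0.0) > threshold

# Hypothetical audit archive: (checker ID, audited-as-false-positive)
archive = [
    ("EXP33-C", True), ("EXP33-C", True), ("EXP33-C", False),
    ("INT31-C", False), ("INT31-C", False), ("INT31-C", True),
]
priors = train_checker_priors(archive)
print(predict(priors, "EXP33-C"))  # True: 2 of 3 audits were FPs
print(predict(priors, "INT31-C"))  # False: 1 of 3 audits were FPs
```

A sketch like this also shows why the cited work stresses consistent audit rules and a shared lexicon: if auditors label determinations inconsistently, the learned per-checker rates are meaningless.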
“…In general there are 40 alarms for every thousand lines of code [16], and 35% to 91% of alarms are false positives [66]. Partitioning alarms into false positives and errors requires manual inspection which is tedious, time-consuming [42,123,149,160], and can be error-prone [42]. The large number of alarms reported by the tools and cost involved in their manual inspection have been observed to be the major reasons for the underuse of static analysis tools in practice [16,31,73,92,96].…”
Section: Introduction
Confidence: 99%
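The alarm-volume figures quoted above (roughly 40 alarms per thousand lines of code, with 35% to 91% of alarms being false positives) imply a concrete triage burden that is easy to work out. A small sketch, using only the numbers from the quoted passage:

```python
def expected_alarm_counts(kloc, alarms_per_kloc=40, fp_low=0.35, fp_high=0.91):
    """Estimate total alarms and the plausible range of false positives
    for a codebase of `kloc` thousand lines of code, using the rates
    reported in the cited surveys."""
    total = kloc * alarms_per_kloc
    return total, (round(total * fp_low), round(total * fp_high))

# A 100 KLOC codebase under these assumptions:
total, (fp_min, fp_max) = expected_alarm_counts(100)
print(total, fp_min, fp_max)  # 4000 alarms, of which 1400-3640 may be FPs
```

At these volumes, manually partitioning alarms into false positives and true errors is exactly the tedious, error-prone work the citing authors identify as a major reason static analysis tools are underused.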