2016 IEEE Cybersecurity Development (SecDev)
DOI: 10.1109/secdev.2016.018

Static Analysis Alert Audits: Lexicon & Rules


Cited by 4 publications (3 citation statements)
References 11 publications
“…One example of insufficient labeled data as a barrier comes from our previous work with 3 large organizations that do software development, where lack of data covering more types of flaws resulted in classifier incorporation being impractical for them [2]. Even when an organization has large audit archives, if the auditors have not used a consistent set of audit rules and a well-defined auditing lexicon, the data may not be useful (and most organizations don't have a well-defined auditing lexicon and auditing rules) [28]. Data-rich Google developed 85% accurate classifier models predicting FindBugs false positives [27].…”
Section: A. Related Work
Confidence: 99%
“…However, that work was only able to develop accurate classifiers for 3 CERT C coding rules with single rule data, despite using a significant quantity of audit archives. Those audit archives include data from 8 years of CERT analysis on 26 codebases, plus new audit data provided by three collaborating organizations over the course of a year where the collaborators audited SA alerts for their own codebases using an auditing lexicon and auditing rules we developed [28]. Our current work addresses that labeled-data quantity problem.…”
Section: A. Related Work
Confidence: 99%
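The classifier approach the citing work describes — learning from archives of audited alert determinations to predict which new alerts are false positives — can be sketched minimally as a per-checker prior: predict "false positive" when a checker's historical false-positive rate in the audit archive exceeds a threshold. This is an illustrative baseline, not the classifiers the cited papers built; the checker IDs and archive data below are hypothetical examples.

```python
from collections import defaultdict

def train_checker_priors(audited_alerts):
    """Learn, per checker ID, the fraction of audited alerts
    that auditors determined to be false positives."""
    counts = defaultdict(lambda: [0, 0])  # checker -> [fp_count, total]
    for checker, is_false_positive in audited_alerts:
        counts[checker][1] += 1
        if is_false_positive:
            counts[checker][0] += 1
    return {c: fp / total for c, (fp, total) in counts.items()}

def predict(priors, checker, threshold=0.5):
    """Predict True (likely false positive) when the checker's historical
    FP rate exceeds the threshold; unseen checkers default to False."""
    return priors.get(checker, 0.0) > threshold

# Hypothetical audit archive: (checker ID, audited-as-false-positive)
archive = [
    ("EXP33-C", True), ("EXP33-C", True), ("EXP33-C", False),
    ("INT31-C", False), ("INT31-C", False), ("INT31-C", True),
]
priors = train_checker_priors(archive)
print(predict(priors, "EXP33-C"))  # True: 2 of 3 audits were FPs
print(predict(priors, "INT31-C"))  # False: 1 of 3 audits were FPs
```

A sketch like this also shows why the cited work stresses consistent audit rules and a shared lexicon: if auditors label determinations inconsistently, the learned per-checker rates are meaningless.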
“…In general there are 40 alarms for every thousand lines of code [16], and 35% to 91% of alarms are false positives [66]. Partitioning alarms into false positives and errors requires manual inspection which is tedious, time-consuming [42,123,149,160], and can be error-prone [42]. The large number of alarms reported by the tools and cost involved in their manual inspection have been observed to be the major reasons for the underuse of static analysis tools in practice [16,31,73,92,96].…”
Section: Introduction
Confidence: 99%
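The alarm-volume figures quoted above (roughly 40 alarms per thousand lines of code, with 35% to 91% of alarms being false positives) imply a concrete triage burden that is easy to work out. A small sketch, using only the numbers from the quoted passage:

```python
def expected_alarm_counts(kloc, alarms_per_kloc=40, fp_low=0.35, fp_high=0.91):
    """Estimate total alarms and the plausible range of false positives
    for a codebase of `kloc` thousand lines of code, using the rates
    reported in the cited surveys."""
    total = kloc * alarms_per_kloc
    return total, (round(total * fp_low), round(total * fp_high))

# A 100 KLOC codebase under these assumptions:
total, (fp_min, fp_max) = expected_alarm_counts(100)
print(total, fp_min, fp_max)  # 4000 alarms, of which 1400-3640 may be FPs
```

At these volumes, manually partitioning alarms into false positives and true errors is exactly the tedious, error-prone work the citing authors identify as a major reason static analysis tools are underused.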