Intertwined developments between program attacks and defenses witness the evolution of program anomaly detection methods. Emerging categories of program attacks, e.g., non-control data attacks and data-oriented programming, are able to comply with normal trace patterns at local views. This article points out the deficiency of existing program anomaly detection models against new attacks and presents long-span behavior anomaly detection (LAD), a model based on mildly context-sensitive grammar verification. The key feature of LAD is its reasoning of correlations among arbitrary events that occurred in long program traces. It extends existing correlation analysis between events at a stack snapshot, e.g., paired call and ret, to correlation analysis among events that historically occurred during the execution. The proposed method leverages specialized machine learning techniques to probe normal program behavior boundaries in vast high-dimensional detection space. Its two-stage modeling/detection design analyzes event correlation at both binary and quantitative levels. Our prototype successfully detects all reproduced real-world attacks against sshd, libpcre, and sendmail. The detection procedure incurs 0.1 ms to 1.3 ms overhead to profile and analyze a single behavior instance that consists of tens of thousands of function call or system call events. North Glebe Road, Arlington, VA 22203; email: naren@cs.vt.edu; T. Jaeger, 346A IST Building, University Park, PA 16802; email: tjaeger@cse.psu.edu. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. 12:2 X. Shu et al.
INTRODUCTIONAttacks to programs are one of the oldest and fundamental threats to computing systems, which evolve and constitute latest attack vectors and advanced persistent threats. Anomaly-based intrusion detection discovers aberrant executions caused by attacks, misconfigurations, program bugs, and unusual usage patterns. The approach models normal program behaviors instead of the threats. It does not bear time lags between emerging attacks and deployed countermeasures as standard defenses do, which are built upon retrospects of inspected attacks. The merit of program anomaly detection is its independence from attack signatures. This property potentially enables proactive defenses against new and unknown threats.The detection accuracy of program anomaly detection methods relies on the precision of normal program behavior description and the completeness of the training [54]. Primitive program attacks, e.g., return addresses mani...