Some recent advances in intrusion detection are based on detecting anomalies in program behavior, as characterized by the sequence of kernel calls the program makes. Specifically, traces of kernel calls are collected during a training period. The substrings of fixed length N (for some N) of those traces are called N-grams. The set of N-grams occurring during normal execution has been found to discriminate effectively between normal behavior of a program and the behavior of the program under attack. The N-gram characterization, while effective, requires the user to choose a suitable value for N. This paper presents an alternative characterization, as a finite state machine whose states represent predictive sequences of different lengths. An algorithm is presented to construct the finite state machine from training data, based on traditional string-processing data structures but employing some novel techniques.
Keywords: Intrusion detection, computational immunology, finite automata, string processing.
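The fixed-length N-gram characterization that the abstract takes as its starting point can be illustrated with a minimal sketch. This is not the finite-state-machine construction the paper proposes, only the baseline it aims to improve on. The function names (`ngrams`, `train`, `anomalies`) and the example call names are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(trace: Sequence[str], n: int) -> List[NGram]:
    """Return every length-n substring of a trace, in order of occurrence."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def train(traces: Iterable[Sequence[str]], n: int) -> Set[NGram]:
    """Collect the set of N-grams seen in traces of normal execution."""
    normal: Set[NGram] = set()
    for trace in traces:
        normal.update(ngrams(trace, n))
    return normal


def anomalies(trace: Sequence[str], normal: Set[NGram], n: int) -> List[NGram]:
    """List the N-grams of a trace that never occurred during training."""
    return [g for g in ngrams(trace, n) if g not in normal]


# Hypothetical kernel-call traces; the call names are illustrative only.
training_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
N = 2  # the user must pick N, which is the drawback the abstract notes
normal_set = train(training_traces, N)

suspect = ["open", "read", "exec", "close"]
flagged = anomalies(suspect, normal_set, N)
```

In this sketch a trace is flagged when it contains any N-gram absent from the training set. Changing `N` changes which traces are flagged, which is the sensitivity that motivates states built from sequences of different lengths.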
A major drawback to the use of attribute grammars in language-based editors has been that attributes can only depend on neighboring attributes in a program's syntax tree. This paper concerns new attribute-grammar-based methods that, for a suitable class of grammars, overcome this fundamental limitation. The techniques presented allow the updating algorithm to skip over arbitrarily large sections of the tree that more straightforward updating methods visit node by node. These techniques are then extended to deal with aggregate values, so that the attribute updating procedure need only follow dependencies due to a changed component of an aggregate value. Although our methods work only for a restricted class of attribute grammars, satisfying the necessary restrictions should not place an undue burden on the writer of the grammar.
We used literate programming on a team project to write a 33,000-line program for the Synthesizer Generator. The program, Penelope, was written using WEB, a tool designed for writing literate programs. Unlike other WEB programs, many of which have been written by WEB's developer or by individuals, Penelope was not intended to be published. We used WEB in the hope that both our team and its final product would benefit from the advantages often attributed to literate programming. The WEB source served as good internal documentation throughout development and maintenance, and it continues to document Penelope's design and implementation. Our experience also uncovered a number of problems with WEB.