Abstract-We present feature selection algorithms for multilayer perceptrons (MLPs) and multi-class Support Vector Machines (SVMs) that use the mutual information between class labels and classifier outputs as the objective function. This objective function requires only inexpensive computation of information measures on discrete variables, provides immunity to prior class probabilities, and brackets the classifier's probability of error. The Maximum Output Information (MOI) algorithms employ this function for feature subset selection by greedy elimination and directed search. The output of the MOI algorithms is a feature subset of user-defined size and an associated trained classifier (MLP or SVM). These algorithms compare favorably with a number of other methods in terms of performance on various artificial and real-world data sets.
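The objective described above, mutual information between class labels and classifier outputs, can be illustrated with a minimal sketch. It assumes the classifier's behavior is summarized as a confusion matrix of counts (rows are true classes, columns are predicted classes) and computes the plain empirical mutual information in bits. The function name `mutual_information` is hypothetical, and the estimator actually used by the MOI algorithms (for example, how it achieves immunity to class priors) may differ.

```python
import math


def mutual_information(confusion):
    """Empirical mutual information I(label; output) in bits.

    `confusion` is a list of rows of counts: rows index the true class,
    columns index the classifier's predicted class. This is an
    illustrative estimator, not necessarily the one used in the paper.
    """
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]        # true-class counts
    col_sums = [sum(col) for col in zip(*confusion)]  # predicted-class counts

    mi = 0.0
    for i, row in enumerate(confusion):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                joint = n / total
                # joint / (p_true * p_pred) simplifies to n * total / (r_i * c_j)
                mi += joint * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


# A perfect two-class classifier on balanced data conveys 1 bit per example;
# a classifier whose outputs are independent of the labels conveys 0 bits.
perfect = [[50, 0], [0, 50]]
uninformative = [[25, 25], [25, 25]]
print(mutual_information(perfect))
print(mutual_information(uninformative))
```

Because both variables are discrete, the computation is a single pass over the confusion matrix, which is consistent with the abstract's description of the objective as inexpensive to evaluate.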
Abstract. Identifying relevant features for a classification task is an important issue in machine learning. In this paper, we present a feature crediting scheme for multi-class pattern recognition tasks that exploits the ability of Support Vector Machines to generalize well in high-dimensional feature spaces. Support Vector learning identifies a small subset of the training data that is relevant to the classification task, but Support Vector Machines primarily tackle the binary classification problem. Our scheme uses these relevant examples to identify relevant features for multi-class classification. For this purpose, we present and employ an information-theoretic measure of classifier performance. This measure addresses the key issue of the average rate of information delivered by the classifier. It provides immunity to sampling bias in the data and sensitivity to the pattern of errors made by the classifier. Empirical results on a number of datasets suggest that the scheme can be applied efficiently to data with a very large number of features.
Advanced malware continues to be a challenge in the digital world that signature-based detection techniques fail to overcome. Malware uses many anti-detection techniques to mutate, so no virus scanner can claim complete detection, even for known malware. Static and dynamic analysis techniques each focus on different kinds of malware, such as evasive or metamorphic malware. This paper proposes a comprehensive approach that combines static checking and dynamic analysis for malware detection. Static analysis is used to check specific code characteristics. Dynamic analysis is used to analyze the runtime behavior of malware. The authors propose a framework for the automated analysis of an executable's behavior using text mining. Text mining of dynamic attributes identifies the important features for classifying an executable as benign or malware. The synergistic combination proposed in this paper allows detection not only of known malware variants but also of obfuscated, packed, and unknown variants, and of malware that evades dynamic analysis.
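As a rough illustration of text mining over dynamic attributes, the sketch below treats each executable's behavior trace as a whitespace-separated document and ranks tokens by how differently they occur in malware and benign traces. Everything here is an assumption made for illustration: the token names, the toy traces, the helper `rank_terms`, and the document-frequency-difference score are hypothetical and are not taken from the paper's framework.

```python
from collections import Counter


def document_frequency(traces):
    """Count, for each token, the number of traces it appears in."""
    counts = Counter()
    for trace in traces:
        counts.update(set(trace.lower().split()))
    return counts


def rank_terms(benign_traces, malware_traces):
    """Rank tokens by |fraction of malware traces - fraction of benign traces|.

    Returns (ranked_tokens, scores). A positive score means the token is
    more common in malware traces; a negative score, in benign traces.
    """
    benign_df = document_frequency(benign_traces)
    malware_df = document_frequency(malware_traces)
    vocabulary = set(benign_df) | set(malware_df)
    scores = {
        token: malware_df[token] / len(malware_traces)
        - benign_df[token] / len(benign_traces)
        for token in vocabulary
    }
    # Sort by descending absolute score, breaking ties alphabetically.
    ranked = sorted(vocabulary, key=lambda t: (-abs(scores[t]), t))
    return ranked, scores


# Toy behavior traces with made-up token names (not real tool output).
benign = [
    "open_file read_file close_file",
    "open_file write_file close_file",
]
malware = [
    "open_file write_registry create_remote_thread",
    "write_registry create_remote_thread close_file",
]

ranked, scores = rank_terms(benign, malware)
print(ranked[:2])
```

The top-ranked tokens would then serve as candidate features for a benign-versus-malware classifier. A real pipeline would likely use richer term weighting and a held-out evaluation.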