Many important applied problems in engineering, finance and medicine can be formulated as anomaly detection based on one-class classification. A classical approach to this problem is to describe the normal state using a one-class support vector machine. To detect anomalies, we then measure the distance from a new observation to the constructed description of the normal class. In this paper we present a new approach to one-class classification. We formulate a new problem statement and a corresponding algorithm that allow privileged information to be taken into account during the training phase. We evaluate the performance of the proposed approach on synthetic datasets, as well as on the publicly available Microsoft Malware Classification Challenge dataset.
One possible approach to tackling class imbalance in classification tasks is to resample the training dataset, i.e., to drop some of its elements or to synthesize new ones. Several resampling methods are widely used. Recent research has shown that the choice of resampling method significantly affects the quality of classification, which raises the resampling selection problem. Exhaustive search for the optimal resampling is time-consuming and hence of limited use. In this paper, we describe an alternative approach to resampling selection. We follow the meta-learning concept to build resampling recommendation systems, i.e., algorithms that recommend a resampling method for a dataset on the basis of its properties.
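To make the two resampling operations mentioned above concrete (dropping elements and adding new ones), here is a minimal sketch of random undersampling and random oversampling in plain Python. It is only an illustration under stated assumptions: the function names `random_oversample` and `random_undersample` are hypothetical, the toy data is made up, and oversampling here duplicates existing rows rather than synthesizing new ones. It is not the recommendation system described in the abstract.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class
    (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced dataset: 8 rows of class 0 and 2 rows of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
```

Both functions balance the classes, but in opposite directions: oversampling grows the dataset to the size of the majority class, while undersampling shrinks it to the size of the minority class. Which one helps more depends on the dataset, which is the selection problem the abstract addresses.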
Anomaly detection based on one-class classification algorithms is widely used in many applied domains, such as image processing (e.g. determining from a mammography image whether a patient is "cancerous" or "healthy"), network intrusion detection, etc. The performance of an anomaly detection algorithm depends crucially on the kernel used to measure similarity in the feature space. The standard approaches to kernel selection used in two-class classification problems (e.g. cross-validation) cannot be applied directly because of the specific nature of the data (the absence of data from a second, abnormal class). In this paper we generalize several kernel selection methods from the binary-class case to the case of one-class classification and perform an extensive comparison of these approaches using both synthetic and real-world data.
The main aim of this work is to develop and implement an automatic anomaly detection algorithm for meteorological time series. To achieve this goal, we develop an approach to constructing an ensemble of anomaly detectors combined with adaptive threshold selection based on artificially generated anomalies. We demonstrate the efficiency of the proposed method by integrating the corresponding implementation into the "Minimax-94" road weather information system.
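The idea of combining an ensemble of detectors with a threshold tuned on artificially generated anomalies can be sketched as follows. This is a rough illustration, not the method of the paper: the two detectors (deviation from a rolling median and jump from the previous value), the averaging rule, the spike-injection scheme, the F1 criterion, and all function names are assumptions chosen for brevity.

```python
import random
import statistics


def deviation_scores(series, window=5):
    """Absolute deviation of each point from the median of the preceding window."""
    scores = []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i] or [value]
        scores.append(abs(value - statistics.median(history)))
    return scores


def jump_scores(series):
    """Absolute change of each point relative to the previous point."""
    return [0.0] + [abs(b - a) for a, b in zip(series, series[1:])]


def normalise(scores):
    """Scale scores into [0, 1] by dividing by the maximum."""
    top = max(scores) or 1.0
    return [s / top for s in scores]


def ensemble_scores(series):
    """Average the normalised scores of the two detectors."""
    a = normalise(deviation_scores(series))
    b = normalise(jump_scores(series))
    return [(x + y) / 2 for x, y in zip(a, b)]


def inject_spikes(series, n_spikes, magnitude, rng):
    """Add artificial spikes at random positions; return the corrupted series
    and the set of positions that were modified."""
    corrupted = list(series)
    positions = set(rng.sample(range(len(series)), n_spikes))
    for i in positions:
        corrupted[i] += magnitude * rng.choice([-1, 1])
    return corrupted, positions


def select_threshold(scores, positions):
    """Pick the score threshold that maximises F1 on the injected anomalies."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        flagged = {i for i, s in enumerate(scores) if s >= t}
        tp = len(flagged & positions)
        if tp == 0:
            continue
        precision = tp / len(flagged)
        recall = tp / len(positions)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Synthetic "temperature-like" series with small noise (made-up data).
rng = random.Random(0)
series = [10 + rng.gauss(0, 0.2) for _ in range(200)]
corrupted, positions = inject_spikes(series, n_spikes=5, magnitude=5.0, rng=rng)
scores = ensemble_scores(corrupted)
threshold, f1 = select_threshold(scores, positions)
```

Because the positions of the injected spikes are known, the threshold can be chosen without any labelled real anomalies; the selected value is then applied to scores computed on new data.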