In this paper we study the problem of finding the support of unknown high-dimensional distributions in the presence of labeling information, called Supervised Novelty Detection (SND). The One-Class Support Vector Machine (SVM) is a widely used kernel-based technique for this problem. However, with this approach it is difficult to model a mixture of distributions that may make up the support. We address this issue by presenting a new class of SVM-like algorithms which approach multi-class classification and novelty detection from a new perspective. We introduce a new coupling term between classes which helps to find a good decision boundary while preserving the compactness of the support through the l2-norm penalty. First, we present our optimization objective in the primal and then derive a dual QP formulation of the problem. Next, we propose a Least-Squares formulation that reduces to a linear system and drastically lowers computational costs. Finally, we derive a Pegasos-based formulation which can effectively cope with large data sets that cannot be handled by many existing QP solvers. We complete our paper with experiments that validate the usefulness and practical importance of the proposed methods in both classification and novelty detection settings.
Pegasos has become a widely acknowledged algorithm for learning linear Support Vector Machines. It exploits properties of the hinge loss and the theory of strongly convex optimization problems to obtain fast convergence rates and lower computational and memory costs. In this paper we adopt the recently proposed pinball loss for the Pegasos algorithm and show some advantages of using it in a variety of classification problems. First, we present the newly derived Pegasos optimization objective with respect to pinball loss and analyze its properties and convergence rates. Additionally, we present extensions of the Pegasos algorithm to kernel-induced and Nyström-approximated feature maps, which introduce nonlinearity in the input space. This is done using a Fixed-Size kernel method approach. Second, we give experimental results on publicly available UCI datasets to justify the advantages and the importance of pinball loss for achieving better classification accuracy and greater numerical stability in the partially or fully stochastic setting. Finally, we conclude our paper with a brief discussion of the applicability of pinball loss to real-life problems.
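The idea of swapping the hinge loss for the pinball loss inside a Pegasos-style update can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the common pinball loss definition on the residual u = 1 - y * <w, x> (u for u >= 0, -tau * u otherwise), the standard Pegasos step size 1 / (lambda * t), one sample per step, a linear model, and no projection step. The function names `pinball_loss` and `pegasos_pinball` and the default parameter values are hypothetical.

```python
import random


def pinball_loss(u, tau):
    """Pinball loss on the residual u = 1 - y * <w, x> (assumed definition)."""
    return u if u >= 0 else -tau * u


def pegasos_pinball(X, y, lam=0.1, tau=0.5, n_iter=5000, seed=0):
    """Stochastic sub-gradient descent on
    lam/2 * ||w||^2 + mean_i pinball_loss(1 - y_i * <w, x_i>, tau).

    Hypothetical sketch: single-sample steps, step size 1 / (lam * t).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        xi, yi = X[i], y[i]
        u = 1.0 - yi * sum(wj * xj for wj, xj in zip(w, xi))
        # Sub-gradient of the loss w.r.t. w is coef * x_i:
        #   u > 0 (inside the margin):  -y_i
        #   u < 0 (beyond the margin):  +tau * y_i  (hinge loss would give 0)
        if u > 0:
            coef = -yi
        elif u < 0:
            coef = tau * yi
        else:
            coef = 0.0
        eta = 1.0 / (lam * t)
        w = [(1.0 - eta * lam) * wj - eta * coef * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Toy usage on a small linearly separable set (illustrative data only).
X_toy = [[2.0, 1.0], [1.5, 2.0], [3.0, 2.5],
         [-2.0, -1.0], [-1.5, -2.0], [-3.0, -2.5]]
y_toy = [1, 1, 1, -1, -1, -1]
w_toy = pegasos_pinball(X_toy, y_toy)
```

Unlike the hinge loss, the pinball loss also penalizes points that lie far beyond the margin, so every sampled point contributes a non-zero sub-gradient unless its residual is exactly zero. Setting `tau=0` in this sketch recovers the usual hinge-loss update.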
Sentence-level classification and sequential labeling are two fundamental tasks in language understanding. While these two tasks are usually modeled separately, in reality, they are often correlated, for example in intent classification and slot filling, or in topic classification and named-entity recognition. In order to utilize the potential benefits from their correlations, we propose a jointly trained model for learning the two tasks simultaneously via Long Short-Term Memory (LSTM) networks. This model predicts the sentence-level category and the word-level label sequence from the stepwise output hidden representations of LSTM. We also introduce a novel mechanism of "sparse attention" to weigh words differently based on their semantic relevance to sentence-level classification. The proposed method outperforms baseline models on ATIS and TREC datasets.
This paper presents some essential findings and results on using ranking-based kernels for the analysis and utilization of high-dimensional and noisy biomedical data in applied clinical diagnostics. We claim that the presented kernels, combined with a state-of-the-art classification technique such as a Support Vector Machine (SVM), could significantly improve the classification rate and predictive power of the wrapper method. Moreover, the advantage of such kernels could potentially be exploited for other kernel methods and essential computer-aided tasks such as novelty detection and clustering. Our experimental results and theoretical generalization bounds imply that ranking-based kernels outperform other traditionally employed SVM kernels on high-dimensional biomedical and microarray data.
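To make the notion of a ranking-based kernel concrete, the sketch below shows one simple possibility: replace each sample's feature values by their ranks and take the inner product of the mean-centred rank vectors. This is an assumption made for illustration (an unnormalised Spearman-type similarity); the abstract does not specify the kernels used in the paper, and the names `ranks` and `rank_kernel` are hypothetical.

```python
def ranks(values):
    """0-based ranks of the entries; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r


def rank_kernel(x, z):
    """Inner product of mean-centred rank vectors of two samples.

    Hypothetical example of a ranking-based kernel; it is positive
    semi-definite because it is an inner product of explicit feature maps.
    """
    rx, rz = ranks(x), ranks(z)
    mean = (len(x) - 1) / 2.0
    return sum((a - mean) * (b - mean) for a, b in zip(rx, rz))


# Illustrative values only: the kernel depends on the ordering of the
# features, not on their magnitudes.
x_demo = [0.1, 5.0, 2.0, 3.5]
x_warped = [v ** 3 + 10.0 for v in x_demo]   # monotone increasing transform
x_reversed = [-v for v in x_demo]            # reverses the ordering
```

Because only the ordering of feature values enters the kernel, any strictly increasing distortion of a sample's measurements leaves its kernel values unchanged, which is one plausible reason such kernels suit noisy high-dimensional data.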
In this paper we present a novel approach and a new machine learning problem, called Supervised Novelty Detection (SND). This problem extends the One-Class Support Vector Machine setting for binary classification while keeping the desirable properties of the novelty detection problem. To tackle it, we approach binary classification from a new perspective using two different estimators and a coupled regularization term. This involves optimization over a different objective and a doubled set of Lagrange multipliers. One might consider our approach as a joint estimation of the support for different probability distributions per class, where the ultimate goal is to separate classes with the largest possible angle between the normal vectors to the decision hyperplanes in the feature space. Given the novelty of the problem, we report our results and compare them with those of the standard C-SVM, LS-SVM and One-Class SVM. Experiments have demonstrated promising results that validate the usefulness of the proposed method.