This article focuses on the problems of feature extraction and the recognition of handwritten digits. A trainable feature extractor based on the LeNet5 convolutional neural network architecture is introduced to solve the first problem in a black box scheme without prior knowledge on the data. The classification task is performed by Support Vector Machines to enhance the generalization ability of LeNet5. In order to increase the recognition rate, new training samples are generated by affine transformations and elastic distortions. Experiments are performed on the well-known MNIST database to validate the method, and the results show that the system can outperform both SVMs and LeNet5 while providing performance comparable to the best on this database. Moreover, an analysis of the errors is conducted to discuss possible means of enhancement and their limitations.
We propose a new framework for hybrid system identification, which relies on continuous optimization. This framework is based on the minimization of a cost function that can be chosen as either the minimum or the product of loss functions. The former is inspired by traditional estimation methods, while the latter is inspired by recent algebraic and support vector regression approaches to hybrid system identification. In both cases, the identification problem is recast as a continuous optimization program involving only the real parameters of the model as variables, thus avoiding the use of discrete optimization. This program can be solved efficiently by using standard optimization methods even for very large data sets. In addition, the proposed framework easily incorporates robustness to different kinds of outliers through the choice of the loss function.
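As a rough illustration of the two cost functions mentioned above, the following Python sketch evaluates a minimum-of-losses cost and a product-of-losses cost. It assumes scalar linear submodels of the form y = theta_j * x, a squared loss, and an average over the data points; these modeling choices, the saturated loss, and all function names are illustrative assumptions rather than the paper's exact formulation.

```python
def squared_loss(residual):
    """Standard squared loss on a residual."""
    return residual ** 2


def saturated_loss(residual, cap=1.0):
    """Hypothetical robust loss: squared loss capped at `cap` to limit outlier influence."""
    return min(residual ** 2, cap)


def min_cost(params, data, loss=squared_loss):
    """Average, over the data, of the smallest loss achieved by any submodel."""
    total = 0.0
    for x, y in data:
        total += min(loss(y - theta * x) for theta in params)
    return total / len(data)


def product_cost(params, data, loss=squared_loss):
    """Average, over the data, of the product of the losses of all submodels."""
    total = 0.0
    for x, y in data:
        prod = 1.0
        for theta in params:
            prod *= loss(y - theta * x)
        total += prod
    return total / len(data)


# Noise-free toy data generated by two modes: y = 2x and y = -x.
data = [(1.0, 2.0), (2.0, 4.0), (1.0, -1.0), (3.0, -3.0)]
print(min_cost([2.0, -1.0], data), product_cost([2.0, -1.0], data))
```

Both costs depend only on the real-valued parameters in `params`; no discrete mode variable appears, which is the property the abstract highlights. Swapping `loss=saturated_loss` into either function is one simple way to see how the choice of loss changes the sensitivity to outliers.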
This paper studies automatic segmentation of multiple motions from tracked feature points through spectral embedding and clustering of linear subspaces. We show that the dimension of the ambient space is crucial for separability, and that low dimensions chosen in prior work are not optimal. We suggest lower and upper bounds together with a data-driven procedure for choosing the optimal ambient dimension. Applied to the Hopkins155 video benchmark database, our approach uniformly outperforms a range of state-of-the-art methods in terms of both segmentation accuracy and computational speed.
This paper explores the incorporation of prior knowledge in support vector regression by the addition of constraints. Equality and inequality constraints are studied with the corresponding types of prior knowledge that can be considered for the method. These include particular points with known values, prior knowledge on any derivative of the function, either provided by a prior model or available only at some specific points, and bounds on the function or any derivative in a given domain. Moreover, a new method for the simultaneous approximation of multiple outputs linked by some prior knowledge is proposed. This method also allows consideration of different types of prior knowledge on single outputs while training on multiple outputs. Synthetic examples show that incorporating a wide variety of prior knowledge becomes easy, as it leads to linear programs, and helps to improve the approximation in difficult cases. The benefits of the method are finally shown on a real-life application, the estimation of in-cylinder residual gas fraction in spark ignition engines, which is representative of numerous situations met in engineering.
For classification, support vector machines (SVMs) have recently been introduced and quickly became the state of the art. Now, the incorporation of prior knowledge into SVMs is the key element that makes it possible to increase the performance in many applications. This paper gives a review of the current state of research regarding the incorporation of two general types of prior knowledge into SVMs for classification. The particular forms of prior knowledge considered here are presented in two main groups: class-invariance and knowledge on the data. The first one includes invariances to transformations, to permutations and in domains of input space, whereas the second one contains knowledge on unlabeled data, the imbalance of the training set or the quality of the data. The methods are then described and classified in the three categories that have been used in the literature: sample methods based on the modification of the training data, kernel methods based on the modification of the kernel and optimization methods based on the modification of the problem formulation. A recent method, developed for support vector regression, considers prior knowledge on arbitrary regions of the input space. It is presented here as applied to the classification case. A discussion is then conducted to regroup sample and optimization methods under a regularization framework.
Hybrid system identification aims at estimating both the discrete state or mode for each data point and the submodel governing the dynamics of the continuous state for each mode. The paper proposes a new method based on kernel regression and Support Vector Machines (SVM) to tackle this problem. The resulting algorithm is able to compute both the discrete state and the submodels in a single step, independently of the discrete state sequence that generated the data. Going beyond previous work, nonlinear submodels are also considered, thus extending the class of systems to which the method can be applied from PieceWise Affine (PWA) and switched linear to PieceWise Smooth (PWS) and switched nonlinear systems with unknown nonlinearities. Piecewise systems with nonlinear boundaries between the modes are also considered, with some preliminary results on this issue.
The paper provides results regarding the computational complexity of hybrid system identification. More precisely, we focus on the estimation of piecewise affine (PWA) maps from input-output data and analyze the complexity of computing a global minimizer of the error. Previous work showed that a global solution could be obtained for continuous PWA maps with a worst-case complexity exponential in the number of data. In this paper, we show how global optimality can be reached for a slightly more general class of possibly discontinuous PWA maps with a complexity that is only polynomial in the number of data but exponential in the data dimension. This result is obtained via an analysis of the intrinsic classification subproblem of associating the data points to the different modes. In addition, we prove that the problem is NP-hard, and thus that the exponential complexity in the dimension is a natural expectation for any exact algorithm.
This paper deals with the switched linear regression problem inherent in hybrid system identification. In particular, we discuss k-LinReg, a straightforward and easy to implement algorithm in the spirit of k-means for the nonconvex optimization problem at the core of switched linear regression, and focus on the question of its accuracy on large data sets and its ability to reach global optimality. To this end, we emphasize the relationship between the sample size and the probability of obtaining a local minimum close to the global one with a random initialization. This is achieved through the estimation of a model of the behavior of this probability with respect to the problem dimensions. This model can then be used to tune the number of restarts required to obtain a global solution with high probability. Experiments show that the model can accurately predict the probability of success and that, despite its simplicity, the resulting algorithm can outperform more complicated approaches in both speed and accuracy.
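To make the "in the spirit of k-means" idea concrete, here is a minimal Python sketch of an alternating scheme with random restarts. It is an illustration under simplifying assumptions, not the authors' implementation: submodels are scalar (y = theta_j * x), initial parameters are drawn uniformly from [-5, 5], and the function names `k_linreg` and `k_linreg_restarts` are hypothetical.

```python
import random


def k_linreg(data, n_modes, n_iter=50, rng=None):
    """One run from a random initialization: alternate assignment and refitting."""
    rng = rng or random.Random(0)
    thetas = [rng.uniform(-5.0, 5.0) for _ in range(n_modes)]
    for _ in range(n_iter):
        # Assignment step: give each point to the submodel with the smallest error.
        labels = [
            min(range(n_modes), key=lambda j: (y - thetas[j] * x) ** 2)
            for x, y in data
        ]
        # Refitting step: scalar least squares on the points of each mode.
        new_thetas = list(thetas)
        for j in range(n_modes):
            sxx = sum(x * x for (x, y), lab in zip(data, labels) if lab == j)
            sxy = sum(x * y for (x, y), lab in zip(data, labels) if lab == j)
            if sxx > 0.0:  # keep the old parameter if the mode received no usable points
                new_thetas[j] = sxy / sxx
        if new_thetas == thetas:  # parameters stopped changing: a local minimum
            break
        thetas = new_thetas
    cost = sum(min((y - t * x) ** 2 for t in thetas) for x, y in data) / len(data)
    return thetas, cost


def k_linreg_restarts(data, n_modes, n_restarts=50, seed=0):
    """Repeat k_linreg from several random initializations and keep the lowest cost."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        thetas, cost = k_linreg(data, n_modes, rng=rng)
        if best is None or cost < best[1]:
            best = (thetas, cost)
    return best


# Noise-free toy data from two modes: y = 2x and y = -x.
data = [(float(x), 2.0 * x) for x in range(1, 5)] + [(float(x), -1.0 * x) for x in range(1, 5)]
thetas, cost = k_linreg_restarts(data, n_modes=2)
print(sorted(thetas), cost)
```

A single run can stall in a poor local minimum, for example when one submodel captures every point. This is why the number of restarts matters, and why a model of the success probability of a random initialization is useful for tuning it.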