Johan A. K. Suykens scite author profile

$\sum_{k=1}^{N} \alpha_k y_k = 0$, $0 \le \alpha_k \le c$, $k = 1, \ldots, N$. Note: $w$ and $\varphi(x_k)$ are not calculated.

• Mercer condition: $K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l)$
• Obtained classifier: $y(x) = \mathrm{sign}\left[\sum_{k=1}^{N} \alpha_k y_k K(x, x_k) + b\right]$, with $\alpha_k$ positive real constants and $b$ a real constant that follow as the solution to the QP problem. Non-zero $\alpha_k$ are called support values, and the corresponding data points are called support vectors. The bias term $b$ follows from the KKT conditions.
• Some possible kernels $K(\cdot, \cdot)$:
  $K(x, x_k) = x_k^T x$ (linear SVM)
  $K(x, x_k) = (x_k^T x + 1)^d$ (polynomial SVM of degree $d$)
  $K(x, x_k) = \exp\{-\|x - x_k\|_2^2 / \sigma^2\}$ (RBF SVM)
  $K(x, x_k) = \tanh(\kappa\, x_k^T x + \theta)$ (MLP SVM)
• In the case of the RBF and MLP kernels, the number of hidden units corresponds to the number of support vectors.
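The four kernels and the classifier formula listed above can be sketched in a few lines of Python. This is only an illustration: the function names and default parameter values are assumptions, and the support values, labels and bias in the usage note are hand-picked rather than obtained from a QP solver.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product x_k^T x."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x: Vector, xk: Vector) -> float:
    return dot(xk, x)


def polynomial_kernel(x: Vector, xk: Vector, d: int = 2) -> float:
    return (dot(xk, x) + 1.0) ** d


def rbf_kernel(x: Vector, xk: Vector, sigma: float = 1.0) -> float:
    # exp(-||x - x_k||_2^2 / sigma^2), matching the form in the text
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xk))
    return math.exp(-sq_dist / sigma ** 2)


def mlp_kernel(x: Vector, xk: Vector, kappa: float = 1.0, theta: float = 0.0) -> float:
    return math.tanh(kappa * dot(xk, x) + theta)


def svm_classify(
    x: Vector,
    support_vectors: Sequence[Vector],
    labels: Sequence[int],
    alphas: Sequence[float],
    b: float,
    kernel: Callable[[Vector, Vector], float],
) -> int:
    """Evaluate y(x) = sign[sum_k alpha_k y_k K(x, x_k) + b].

    Only w and phi(x_k) are avoided; the kernel is evaluated directly.
    A score of exactly zero is mapped to +1 here (an arbitrary choice).
    """
    score = sum(
        a * y * kernel(x, xk)
        for a, y, xk in zip(alphas, labels, support_vectors)
    ) + b
    return 1 if score >= 0 else -1
```

For example, with support vectors `(1, 0)` and `(-1, 0)`, labels `+1` and `-1`, and equal support values (so that the constraint $\sum_k \alpha_k y_k = 0$ holds), `svm_classify((0.5, 0.0), ...)` with the linear kernel returns `1`.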

Least Squares Support Vector Machines

Suykens

Gestel

Brabanter

et al. 2002

1,471

1,378

Untitled

1999

Weighted least squares support vector machines: robustness and sparse approximation

et al. 2002

Least squares support vector machines (LS-SVM) is an SVM version which involves equality instead of inequality constraints and works with a least squares cost function. In this way, the solution follows from a linear Karush-Kuhn-Tucker system instead of a quadratic programming problem. However, sparseness is lost in the LS-SVM case and the estimation of the support values is only optimal in the case of a Gaussian distribution of the error variables. In this paper, we discuss a method which can overcome these two drawbacks. We show how to obtain robust estimates for regression by applying a weighted version of LS-SVM. We also discuss a sparse approximation procedure for weighted and unweighted LS-SVM. It is basically a pruning method which is able to do pruning based upon the physical meaning of the sorted support values, while pruning procedures for classical multilayer perceptrons require the computation of a Hessian matrix or its inverse. The methods of this paper are illustrated for RBF kernels and demonstrate how to obtain robust estimates with selection of an appropriate number of hidden units, in the case of outliers or non-Gaussian error distributions with heavy tails.
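To make the "linear Karush-Kuhn-Tucker system" concrete, here is a minimal sketch of LS-SVM regression with an RBF kernel. It assumes the standard dual system $\begin{bmatrix} 0 & 1^T \\ 1 & K + \mathrm{diag}(1/(\gamma v_k)) \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$, where the weights $v_k$ default to 1 (unweighted case). The function names, the small Gaussian-elimination helper and the toy data are illustrative assumptions, not code from the paper, and the robust rule for choosing the weights is not shown.

```python
import math
from typing import List, Optional, Sequence, Tuple


def rbf(x: Sequence[float], z: Sequence[float], sigma: float) -> float:
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sigma ** 2)


def solve(A: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    gamma: float,
    sigma: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, List[float]]:
    """Return (b, alphas) from one linear solve; no QP is involved."""
    n = len(X)
    v = list(weights) if weights is not None else [1.0] * n
    A = [[0.0] + [1.0] * n]  # first row enforces sum(alpha) = 0
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / (gamma * v[i])  # regularization on the diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(x, X, alphas, b, sigma) -> float:
    return sum(a * rbf(x, xk, sigma) for a, xk in zip(alphas, X)) + b


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.0, 1.0, 4.0, 9.0]
    b, alphas = lssvm_fit(X, y, gamma=100.0, sigma=1.0)
    print(b, alphas)  # typically every alpha is non-zero
```

Because each support value is proportional to its residual ($\alpha_k = \gamma v_k e_k$), generally none of them vanish, which is the loss of sparseness the abstract refers to. Passing smaller `weights` for points with large residuals down-weights suspected outliers.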

Benchmarking state-of-the-art classification algorithms for credit scoring

Baesens

Gestel

Viaene

et al. 2003

Journal of the Operational Research Society

698

447

In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.

Benchmarking Least Squares Support Vector Machine Classifiers

et al. 2004

Abstract. In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
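The idea of representing a multiclass problem by a set of binary classifiers with an output code can be sketched as follows. The abstract does not name the coding schemes, so the two codebooks here (one-versus-all and a minimum-length binary code) are assumptions chosen for illustration, as are the function names. Decoding picks the class whose codeword has the smallest Hamming distance to the vector of binary classifier outputs.

```python
from typing import Dict, List, Sequence


def one_vs_all_codebook(classes: Sequence[str]) -> Dict[str, List[int]]:
    """One binary classifier per class: +1 for that class, -1 otherwise."""
    n = len(classes)
    return {
        c: [1 if i == j else -1 for j in range(n)]
        for i, c in enumerate(classes)
    }


def minimum_output_codebook(classes: Sequence[str]) -> Dict[str, List[int]]:
    """Encode the class index in binary, using ceil(log2(#classes)) classifiers."""
    n_bits = max(1, (len(classes) - 1).bit_length())
    return {
        c: [1 if (i >> bit) & 1 else -1 for bit in range(n_bits)]
        for i, c in enumerate(classes)
    }


def decode(outputs: Sequence[int], codebook: Dict[str, List[int]]) -> str:
    """Return the class whose codeword is closest in Hamming distance.

    Ties are broken by codebook insertion order (an arbitrary choice).
    """
    def hamming(code: List[int]) -> int:
        return sum(1 for o, c in zip(outputs, code) if o != c)

    return min(codebook, key=lambda c: hamming(codebook[c]))
```

Each position in a codeword corresponds to one binary classifier (for instance, one LS-SVM with targets $\pm 1$), so one-versus-all needs as many classifiers as classes, while the minimum-length code needs only about $\log_2$ of that number but leaves no redundancy for correcting a wrong binary output.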

Optimal control by least squares support vector machines

2001

Application of a Smoothing Technique to Decomposition in Convex Optimization

Necoara

Suykens

2008

IEEE Trans. Automat. Contr.

153

227

Dual decomposition is a powerful technique for deriving decomposition schemes for convex optimization problems with separable structure. Although the Augmented Lagrangian is computationally more stable than the ordinary Lagrangian, the prox-term destroys the separability of the given problem. In this paper we use another approach to obtain a smooth Lagrangian, based on a smoothing technique developed by Nesterov, which preserves separability of the problem. With this approach we derive a new decomposition method, called proximal center algorithm, which from the viewpoint of efficiency estimates improves the bounds on the number of iterations of the classical dual gradient scheme by an order of magnitude. This can be achieved with the new decomposition algorithm since the resulting dual function has good smoothness properties and since we make use of the particular structure of the given problem.
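As a point of reference, the classical dual gradient scheme that the abstract uses as its baseline can be sketched on a toy separable problem. This is not the proximal center algorithm. The problem is assumed for illustration: minimize $\sum_i \tfrac{1}{2} a_i (x_i - t_i)^2$ subject to the coupling constraint $\sum_i x_i = c$, with hypothetical names `a`, `t`, `c` and `step`. For a fixed multiplier $\lambda$ the Lagrangian splits into independent subproblems with closed-form minimizers $x_i(\lambda) = t_i - \lambda / a_i$, and $\lambda$ is updated by gradient ascent on the dual.

```python
from typing import List, Sequence, Tuple


def dual_gradient(
    a: Sequence[float],
    t: Sequence[float],
    c: float,
    step: float,
    iterations: int = 200,
) -> Tuple[List[float], float]:
    """Dual gradient ascent for min sum_i 0.5*a_i*(x_i - t_i)^2 s.t. sum_i x_i = c.

    Assumes all a_i > 0 and step < 2 / sum_i (1 / a_i) so the iteration converges.
    """
    lam = 0.0
    x = list(t)
    for _ in range(iterations):
        # Separable step: each subproblem is solved independently for fixed lam.
        x = [ti - lam / ai for ai, ti in zip(a, t)]
        # The dual gradient is the residual of the coupling constraint.
        residual = sum(x) - c
        lam += step * residual
    return x, lam


if __name__ == "__main__":
    x, lam = dual_gradient(a=[1.0, 2.0], t=[1.0, 3.0], c=2.0, step=0.5)
    print(x, lam)  # x sums to approximately 2.0
```

Only the multiplier update needs information from all subproblems; the minimizations over the $x_i$ could run in parallel, which is the separability that the abstract says the smoothing approach preserves.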
