Nonsmooth Optimization Techniques for Semisupervised Classification

Cluster analysis is an important task in data mining. It deals with organizing a collection of objects into clusters based on a similarity measure, which can be defined using various distance functions. Cluster analysis problems in which the similarity measure is defined by the squared Euclidean distance, also known as minimum sum-of-squares clustering, have been studied extensively over the last five decades. However, problems with the L1 and L∞ norms have attracted less attention. In this chapter, we consider a nonsmooth nonconvex optimization formulation of cluster analysis problems. This formulation makes it easy to apply similarity measures defined using different distance functions. Moreover, an efficient incremental algorithm can be designed based on this formulation to solve clustering problems. We develop incremental algorithms for solving clustering problems where the similarity measure is defined using the L1, L2 and L∞ norms. We also consider different algorithms for solving nonsmooth nonconvex optimization problems in cluster analysis. The proposed algorithms are tested on several real-world data sets and compared with other similar algorithms.
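
The abstract above does not spell out the objective function, so the following is only a minimal sketch of one common way to write clustering as a nonsmooth problem: the average, over all data points, of the distance to the nearest cluster center. The function names (`dist_l1`, `dist_l2_squared`, `dist_linf`, `cluster_objective`) and the toy data are assumptions made for illustration, not taken from the chapter.

```python
def dist_l1(x, a):
    """L1 (Manhattan) distance between two points."""
    return sum(abs(xi - ai) for xi, ai in zip(x, a))


def dist_l2_squared(x, a):
    """Squared Euclidean distance (minimum sum-of-squares clustering)."""
    return sum((xi - ai) ** 2 for xi, ai in zip(x, a))


def dist_linf(x, a):
    """L-infinity (Chebyshev) distance between two points."""
    return max(abs(xi - ai) for xi, ai in zip(x, a))


def cluster_objective(centers, points, dist):
    """Average distance from each point to its nearest center.

    The inner min over centers makes the function nonsmooth and
    nonconvex in the centers, whichever distance is plugged in.
    """
    return sum(min(dist(c, p) for c in centers) for p in points) / len(points)


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (2.0, 2.0), (10.0, 0.0), (12.0, 2.0)]
centers = [(1.0, 1.0), (11.0, 1.0)]

for name, dist in [("L1", dist_l1), ("squared L2", dist_l2_squared), ("Linf", dist_linf)]:
    print(name, cluster_objective(centers, points, dist))
```

Swapping the distance function is the only change needed to move between the three similarity measures, which is the flexibility the abstract attributes to the optimization formulation.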

“…Nonsmooth optimization models of unsupervised and supervised data classification problems are also discussed in [4,5,23].…”

Section: Comparison Of Different Formulations Of Clustering Problem (mentioning)

confidence: 99%

Nonsmooth Optimization Based Algorithms in Cluster Analysis

Bagirov

Mohebi

2014

Partitional Clustering Algorithms

“…As proposed there, we can apply a gradient descent based technique, either relying on subgradients [3] or smoothing the objective function first, e.g. by a softmax operation.…”

Section: Optimization By Gradient Descent In the Primal (mentioning)

confidence: 99%
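
The citation statement above mentions smoothing a nonsmooth objective "by a softmax operation" before applying gradient descent. As a hedged illustration, the sketch below assumes this refers to the standard log-sum-exp approximation of a maximum, whose gradient is the softmax distribution. The names `smooth_max` and `softmax_weights` and the parameter `beta` are hypothetical and do not come from the cited paper.

```python
import math


def smooth_max(values, beta):
    """Log-sum-exp approximation of max(values).

    It satisfies max(values) <= smooth_max <= max(values) + log(n) / beta,
    so a larger beta gives a tighter (but less smooth) approximation.
    """
    m = max(values)  # shift by the max for numerical stability
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta


def softmax_weights(values, beta):
    """Gradient of smooth_max with respect to values (a softmax distribution)."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


values = [1.0, 2.0, 3.0]
for beta in (1.0, 10.0, 100.0):
    print(beta, smooth_max(values, beta), softmax_weights(values, beta))
```

As `beta` grows, the weights concentrate on the largest entry, so the smooth gradient approaches a subgradient of the original max function.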

Partitioning of image datasets using discriminative context information

Lampert

2008

2008 IEEE Conference on Computer Vision and Pattern Recognition

We propose a new method to partition an unlabeled dataset, called Discriminative Context Partitioning (DCP). It is motivated by the idea of splitting the dataset based only on how well the resulting parts can be separated from a context class of disjoint data points. This is in contrast to typical clustering techniques like K-means that are based on a generative model by implicitly or explicitly searching for modes in the distribution of samples. The discriminative criterion in DCP avoids the problems that density-based methods have when the a priori assumption of multimodality is violated, when the number of samples becomes small in relation to the dimensionality of the feature space, or when the cluster sizes are strongly unbalanced. We formulate DCP's separation property as a large-margin criterion, and show how the resulting optimization problem can be solved efficiently. Experiments on the MNIST and USPS datasets of handwritten digits and on a subset of the Caltech256 dataset show that, given a suitable context, DCP can achieve good results even in situations where density-based clustering techniques fail.

“…(Vapnik, 1998) based on their unlabeled sets under the exactly same setting as theirs. Note that the datasets for g50c and g10n were given in Chapelle and Zien (2005) whereas those of Heart and Ionosphere were sampled at random according to Astorino and Fuduli (2005). Table 3 indicates that TSVM DCA outperforms ∇TSVM in all the cases, while outperforming TSVM Bundle in all cases except g50c.…”

Section: Generalization Performance (mentioning)

confidence: 99%

“…Chapelle and Zien (2005) suggested that the cost function of TSVM is appropriate but implementation of TSVM is inadequate. Astorino and Fuduli (2005) also noted that implementation of TSVM is an issue.…”

Section: Introduction (mentioning)

confidence: 99%

On transductive support vector machines

Shen

Wang

Pan

2007

Prediction and Discovery

Transductive support vector machines (TSVM) have been widely used as a means of treating partially labeled data in semisupervised learning. They have remained somewhat mysterious because their foundation in generalization is not well understood. This article aims to clarify several controversial aspects of TSVM. Two main results are established. First, TSVM performs no worse than its supervised counterpart SVM when tuning is performed, which is contrary to several studies indicating otherwise. The "alleged" inferior performance of TSVM arises mainly because it was not tuned, in addition to the minimization routines involved. Second, we utilize difference convex programming to derive a nonconvex minimization routine for TSVM, which compares favorably against some state-of-the-art methods. This, together with our learning theory, lends some support to TSVM.
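
The abstract above refers to difference convex (DC) programming without giving the decomposition. As an illustrative sketch only, the code below assumes the nonconvex term is the symmetric hinge loss max(0, 1 - |z|) often used for unlabeled points in TSVM, and checks numerically that it can be written as the difference of two convex functions. This particular split and the function names are assumptions, not the authors' stated construction.

```python
def symmetric_hinge(z):
    """Nonconvex loss commonly applied to unlabeled points: max(0, 1 - |z|)."""
    return max(0.0, 1.0 - abs(z))


def convex_part(z):
    """First convex function in the assumed split: max(0, |z| - 1) + 1."""
    return max(0.0, abs(z) - 1.0) + 1.0


def subtracted_convex_part(z):
    """Second convex function in the assumed split: |z|."""
    return abs(z)


# Check on a grid that symmetric_hinge(z) == convex_part(z) - subtracted_convex_part(z).
grid = [i / 10.0 for i in range(-30, 31)]
max_gap = max(
    abs(symmetric_hinge(z) - (convex_part(z) - subtracted_convex_part(z)))
    for z in grid
)
print("largest deviation on the grid:", max_gap)
```

A DC routine would then replace the subtracted convex part by a linear approximation at the current iterate and solve the resulting convex subproblem repeatedly.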

Nonsmooth Optimization Techniques for Semisupervised Classification

Cited by 56 publications

References 23 publications
