2017 IEEE International Conference on Data Mining (ICDM)
DOI: 10.1109/icdm.2017.18

Large Scale Kernel Methods for Online AUC Maximization

Abstract: Learning for maximizing AUC performance is an important research problem in Machine Learning and Artificial Intelligence. Unlike traditional batch learning methods for maximizing AUC which often suffer from poor scalability, recent years have witnessed some emerging studies that attempt to maximize AUC by single-pass online learning approaches. Despite their encouraging results reported, the existing online AUC maximization algorithms often adopt simple online gradient descent approaches that fail to exploit t…

Cited by 22 publications (29 citation statements). References 30 publications (40 reference statements).
“…In addition to these, we implement two mini-batch stochastic gradient algorithms for large scale CTR prediction problems: MB-PHL is a mini-batch gradient descent algorithm which uses PHL. A variant of this approach is also proposed in the recent work of [12]. MB-PSL is another mini-batch gradient method that uses PSL.…”
Section: Methods (mentioning)
confidence: 99%
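The mini-batch pairwise hinge loss (PHL) update described in this statement can be sketched for a linear AUC scorer as follows. The function name, argument names, and the pair-averaging scheme are illustrative assumptions, not the MB-PHL implementation of the cited work.

```python
import numpy as np

def mb_phl_step(w, X_pos, X_neg, lr=0.01, margin=1.0):
    """One mini-batch gradient step on the pairwise hinge loss (PHL) for a
    linear AUC scorer f(x) = w . x.  Hypothetical sketch; names and the
    averaging over pairs are assumptions, not the cited MB-PHL algorithm."""
    s_pos = X_pos @ w                     # scores of the positives in the batch
    s_neg = X_neg @ w                     # scores of the negatives in the batch
    # the hinge is active for (positive, negative) pairs violating the margin
    active = (s_pos[:, None] - s_neg[None, :]) < margin   # shape (p, n)
    if not active.any():
        return w
    # gradient of the sum over active pairs of (margin - w.(x_pos - x_neg))
    grad = -(active.sum(axis=1) @ X_pos - active.sum(axis=0) @ X_neg)
    grad /= X_pos.shape[0] * X_neg.shape[0]               # average over all pairs
    return w - lr * grad
```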
“…Later, [8] use the pairwise squared loss function, which eliminates the need for buffering previous instances; [9] propose adaptive gradient/subgradient methods which can also handle sparse inputs, while [10], [11] consider the nonlinear AUC maximization problem using kernel and multiple-kernel methods. Most recently, [12] focuses on scalable kernel methods.…”
Section: Introduction (mentioning)
confidence: 99%
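The buffering-free property attributed to the pairwise squared loss follows because the objective expands into terms involving only per-class means and second moments, which can be maintained incrementally. A minimal sketch of that standard objective, in our own notation rather than the quoted paper's:

```latex
% Pairwise squared loss over positives P and negatives N for a linear scorer w
\[
  L(\mathbf{w}) \;=\; \frac{1}{|P|\,|N|}
  \sum_{\mathbf{x}^{+} \in P} \sum_{\mathbf{x}^{-} \in N}
  \bigl( 1 - \mathbf{w}^{\top} (\mathbf{x}^{+} - \mathbf{x}^{-}) \bigr)^{2}
\]
```

Expanding the square leaves only the class means and second-moment matrices of P and N, each of which can be updated one instance at a time, so past instances never need to be stored.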
“…For massive datasets, however, the size of the budget needs to be large to reduce the variance of the model and to achieve an acceptable accuracy, which in turn increases the training time complexity. The work [8] attempts to address the scalability problem of kernelized online AUC maximization by learning a mini-batch linear classifier on an embedded feature space. The authors explore both Nyström approximation and random Fourier features to construct an embedding in an online setting.…”
Section: Related Work (mentioning)
confidence: 99%
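The random Fourier feature embedding mentioned in this statement can be sketched as below for an RBF kernel. The helper name and parameters are hypothetical, and what is shown is the generic random Fourier feature construction, not the exact online variant of the cited work.

```python
import numpy as np

def make_rff_embedding(dim_in, dim_out, gamma, seed=0):
    """Random Fourier features for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).
    Minimal sketch of the embedding idea referenced above; names are illustrative."""
    rng = np.random.default_rng(seed)
    # spectral distribution of this RBF kernel is Gaussian with std sqrt(2 * gamma)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim_in, dim_out))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim_out)

    def embed(X):
        # z(x) such that z(x) . z(y) approximates k(x, y)
        return np.sqrt(2.0 / dim_out) * np.cos(X @ W + b)

    return embed
```

A linear model, for instance a mini-batch AUC maximizer like the earlier sketch, can then be trained on z(x) in place of x, which is the embedded-space strategy the statement describes.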
“…AUC optimization has been extensively studied over the past decades [6], [26], [27], [28]. Most of the algorithms were designed for learning classifiers for classification problems.…”
Section: AUC Optimization (mentioning)
confidence: 99%