Jin Huang scite author profile

Using AUC and accuracy in evaluating learning algorithms

The area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been recently proposed as an alternative single-number measure for evaluating the predictive ability of learning algorithms. However, no formal arguments were given as to why AUC should be preferred over accuracy. In this paper, we establish formal criteria for comparing two different measures for learning algorithms, and we show theoretically and empirically that AUC is, in general, a better measure (defined precisely) than accuracy. We then reevaluate well-established claims in machine learning based on accuracy using AUC, and obtain interesting and surprising new results. We also show that AUC is more directly associated with the net profit than accuracy in direct marketing, suggesting that learning algorithms should optimize AUC instead of accuracy in real-world applications.

Improving Sequential Recommendation with Knowledge-Enhanced Memory Networks

Huang et al. 2018

AUC: A Better Measure than Accuracy in Comparing Learning Algorithms

2003

Abstract. Predictive accuracy has been widely used as the main criterion for comparing the predictive ability of classification systems (such as C4.5, neural networks, and Naive Bayes). Most of these classifiers also produce probability estimations of the classification, but they are completely ignored in the accuracy measure. This is often taken for granted because both training and testing sets only provide class labels. In this paper we establish rigorously that, even in this setting, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, provides a better measure than accuracy. Our result is quite significant for three reasons. First, we establish, for the first time, rigorous criteria for comparing evaluation measures for learning algorithms. Second, it suggests that AUC should replace accuracy when measuring and comparing classification systems. Third, our result also prompts us to re-evaluate many well-established conclusions based on accuracy in machine learning. For example, it is well accepted in the machine learning community that, in terms of predictive accuracy, Naive Bayes and decision trees are very similar. Using AUC, however, we show experimentally that Naive Bayes is significantly better than the decision-tree learning algorithms.
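The point that accuracy ignores probability estimates can be illustrated with a small sketch. The labels, scores, and the 0.5 threshold below are made-up values chosen for illustration, not data from the paper, and the AUC is computed with the standard pairwise-ranking formulation rather than any procedure the authors describe.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test set: three positives followed by three negatives.
labels = [1, 1, 1, 0, 0, 0]
# Two hypothetical classifiers that make the same thresholded predictions
# but rank the examples differently.
scores_a = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
scores_b = [0.9, 0.6, 0.05, 0.8, 0.2, 0.1]

print("accuracy:", accuracy(labels, scores_a), accuracy(labels, scores_b))
print("AUC:     ", auc(labels, scores_a), auc(labels, scores_b))
```

Both classifiers score 4/6 on accuracy, while their AUC values are 8/9 and 5/9: the ranking information carried by the scores separates them even though the thresholded labels do not.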

Destination prediction by sub-trajectory synthesis and privacy protection against such prediction

et al. 2013

Comparing naive Bayes, decision trees, and SVM with AUC and accuracy

Huang, Ling

162

Robust Manifold Nonnegative Matrix Factorization

Huang, Nie, Huang et al. 2014

ACM Trans. Knowl. Discov. Data

177

Nonnegative Matrix Factorization (NMF) has been one of the most widely used clustering techniques for exploratory data analysis. However, since each data point enters the objective function with squared residue error, a few outliers with large errors easily dominate the objective function. In this article, we propose a Robust Manifold Nonnegative Matrix Factorization (RMNMF) method using the ℓ2,1-norm and integrating NMF and spectral clustering under the same clustering framework. We also point out the solution uniqueness issue for the existing NMF methods and propose an additional orthonormal constraint to address this problem. With the new constraint, the conventional auxiliary function approach no longer works. We tackle this difficult optimization problem via a novel Augmented Lagrangian Method (ALM)-based algorithm and convert the original constrained optimization problem on one variable into a multivariate constrained problem. The new objective function can then be decomposed into several subproblems, each of which has a closed-form solution. More importantly, we reveal the connection of our method with robust K-means and spectral clustering, and we demonstrate its theoretical significance. Extensive experiments have been conducted on nine benchmark datasets, and all empirical results show the effectiveness of our method.
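The claim that squared residue errors let a few outliers dominate the objective can be checked numerically. The sketch below is not the RMNMF algorithm. It only compares two loss functions on a made-up set of per-point residual vectors, assuming the ℓ2,1-norm is taken as the sum of the Euclidean norms of those vectors.

```python
import math


def squared_frobenius(residuals):
    """Sum of squared entries: each point contributes its squared error."""
    return sum(x * x for r in residuals for x in r)


def l21_norm(residuals):
    """Sum of per-point Euclidean norms: each point contributes its
    unsquared error, so large residuals are penalised linearly."""
    return sum(math.sqrt(sum(x * x for x in r)) for r in residuals)


# Hypothetical residual vectors, one per data point. The last is an outlier.
residuals = [[0.3, 0.4], [0.6, 0.8], [6.0, 8.0]]
outlier = [residuals[-1]]

share_squared = squared_frobenius(outlier) / squared_frobenius(residuals)
share_l21 = l21_norm(outlier) / l21_norm(residuals)

print(f"outlier share of squared loss: {share_squared:.3f}")
print(f"outlier share of l2,1 loss:    {share_l21:.3f}")
```

With these values the outlier accounts for roughly 99% of the squared loss but roughly 87% of the ℓ2,1 loss, which is the sense in which the unsquared norm is less sensitive to a single large residual.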

Solving the data sparsity problem in destination prediction

Xue, Xie et al. 2014

The VLDB Journal

Destination prediction is an essential task for many emerging location-based applications such as recommending sightseeing places and targeted advertising according to destinations. A common approach to destination prediction is to derive the probability of a location being the destination based on historical trajectories. However, almost all the existing techniques use various kinds of extra information such as road networks, proprietary travel planners, statistics requested from government, and personal driving habits. Such extra information, in most circumstances, is unavailable or very costly to obtain. We therefore approach the task of destination prediction using only a historical trajectory dataset. However, this approach encounters the "data sparsity problem", i.e., the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories that can obtain predicted destinations. We propose a novel method named Sub-Trajectory Synthesis

Taxonomy-Aware Multi-Hop Reasoning Networks for Sequential Recommendation

Huang, Zhang, Zhao et al. 2019
