Abstract. Predictive accuracy has been widely used as the main criterion for comparing the predictive ability of classification systems (such as C4.5, neural networks, and Naive Bayes). Most of these classifiers also produce probability estimates of the classification, but these are completely ignored by the accuracy measure. This is often taken for granted because both training and testing sets only provide class labels. In this paper we establish rigorously that, even in this setting, the area under the ROC (Receiver Operating Characteristic) curve, or simply AUC, provides a better measure than accuracy. Our result is significant for three reasons. First, we establish, for the first time, rigorous criteria for comparing evaluation measures for learning algorithms. Second, it suggests that AUC should replace accuracy when measuring and comparing classification systems. Third, our result also prompts us to re-evaluate many well-established conclusions based on accuracy in machine learning. For example, it is well accepted in the machine learning community that, in terms of predictive accuracy, Naive Bayes and decision trees are very similar. Using AUC, however, we show experimentally that Naive Bayes is significantly better than the decision-tree learning algorithms.
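As a concrete illustration of this point (ours, not taken from the paper), the sketch below compares two classifiers using hypothetical probability estimates: both reach the same accuracy at the usual 0.5 threshold, but their AUC differs because AUC uses the full ordering induced by the probability estimates that accuracy throws away. It assumes NumPy and scikit-learn are available.

```python
# Illustrative sketch: same accuracy, different AUC, because accuracy ignores
# the probability estimates while AUC uses the ranking they induce.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1])

# Hypothetical probability estimates P(y=1 | x) from two classifiers.
scores_a = np.array([0.1, 0.2, 0.6, 0.4, 0.8, 0.9])
scores_b = np.array([0.4, 0.3, 0.6, 0.45, 0.55, 0.7])

for name, scores in [("A", scores_a), ("B", scores_b)]:
    y_pred = (scores >= 0.5).astype(int)          # hard labels at threshold 0.5
    print(name,
          "accuracy:", accuracy_score(y_true, y_pred),   # 0.667 for both
          "AUC:", roc_auc_score(y_true, scores))         # 0.889 vs 0.778
```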
Naïve Bayes is one of the most efficient and effective inductive learning algorithms for machine learning and data mining. Its competitive performance in classification is surprising, because the conditional independence assumption on which it is based is rarely true in real-world applications. An open question is: what is the true reason for the surprisingly good performance of Naïve Bayes in classification? In this paper, we propose a novel explanation for the good classification performance of Naïve Bayes. We show that, essentially, the dependence distribution plays a crucial role. Here dependence distribution means how the local dependence of an attribute is distributed in each class, evenly or unevenly, and how the local dependences of all attributes work together, consistently (supporting a certain classification) or inconsistently (canceling each other out). Specifically, we show that no matter how strong the dependences among attributes are, Naïve Bayes can still be optimal if the dependences are distributed evenly across classes, or if the dependences cancel each other out. We propose and prove a sufficient and necessary condition for the optimality of Naïve Bayes. Further, we investigate the optimality of Naïve Bayes under the Gaussian distribution, and we present and prove a sufficient condition for its optimality in which dependences among attributes exist. This provides evidence that dependences may cancel each other out. Our theoretical analysis can be used in designing learning algorithms. In fact, a major class of learning algorithms for Bayesian networks is conditional independence-based (or CI-based), and such algorithms are essentially based on dependence. We design a dependence distribution-based algorithm by extending the Chow-Liu algorithm, a widely used CI-based algorithm. Our experiments show that the new algorithm outperforms the Chow-Liu algorithm, which also provides empirical evidence supporting our new explanation.
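To make the decision rule under discussion concrete, here is a minimal naive Bayes sketch in Python (our own illustration, not the paper's construction): the prediction depends only on which class maximizes P(c) ∏_i P(a_i | c), so the rule can remain optimal when attribute dependences are distributed evenly across classes or cancel out, even though the independence assumption is violated. The function names and the Laplace smoothing parameter are our choices.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(X, y, alpha=1.0):
    """Laplace-smoothed frequency estimates of P(c) and P(a_i = v | c)."""
    classes = sorted(set(y))
    class_counts = Counter(y)
    prior = {c: (class_counts[c] + alpha) / (len(y) + alpha * len(classes))
             for c in classes}
    counts = defaultdict(Counter)    # (i, c) -> Counter over attribute values
    values = defaultdict(set)        # i -> observed values of attribute i
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            counts[(i, c)][v] += 1
            values[i].add(v)
    return prior, counts, values, classes, alpha

def nb_predict(model, row):
    """Return the class maximizing log P(c) + sum_i log P(a_i | c)."""
    prior, counts, values, classes, alpha = model
    def score(c):
        s = math.log(prior[c])
        for i, v in enumerate(row):
            num = counts[(i, c)][v] + alpha
            den = sum(counts[(i, c)].values()) + alpha * len(values[i])
            s += math.log(num / den)
        return s
    return max(classes, key=score)
```

Only the ordering of the class scores matters for the final prediction, which is why dependences that shift all class scores in the same way leave the classification unchanged.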
Bayesian network classifiers have been widely used for classification problems. Given a fixed Bayesian network structure, parameter learning can take two different approaches: generative and discriminative learning. While generative parameter learning is more efficient, discriminative parameter learning is more effective. In this paper, we propose a simple, efficient, and effective discriminative parameter learning method, called Discriminative Frequency Estimate (DFE), which learns parameters by discriminatively computing frequencies from data. Empirical studies show that the DFE algorithm integrates the advantages of both generative and discriminative learning: it performs as well as the state-of-the-art discriminative parameter learning method ELR in accuracy, but is significantly more efficient.
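The abstract does not spell out the DFE update, so the sketch below is only a hypothetical reading of "discriminatively computing frequencies", written for a naive Bayes structure: each training sweep adds the current prediction error 1 − P̂(c | x) to the frequency counts of the true class, instead of adding 1 as generative frequency estimation does. The paper's exact update rule, structure handling, and stopping criterion may differ.

```python
from collections import defaultdict

def dfe_sketch(X, y, n_sweeps=5, alpha=1.0):
    """Hypothetical DFE-style sketch for a naive Bayes structure."""
    classes = sorted(set(y))
    n_values = [len(set(col)) for col in zip(*X)]   # distinct values per attribute

    class_count = defaultdict(float)                # c -> weighted count
    attr_count = defaultdict(float)                 # (i, v, c) -> weighted count
    attr_total = defaultdict(float)                 # (i, c) -> weighted total

    def posterior(row):
        scores = {}
        for c in classes:
            p = class_count[c] + alpha
            for i, v in enumerate(row):
                p *= ((attr_count[(i, v, c)] + alpha)
                      / (attr_total[(i, c)] + alpha * n_values[i]))
            scores[c] = p
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

    for _ in range(n_sweeps):
        for row, c in zip(X, y):
            err = 1.0 - posterior(row)[c]           # discriminative step size
            class_count[c] += err
            for i, v in enumerate(row):
                attr_count[(i, v, c)] += err
                attr_total[(i, c)] += err
    return class_count, attr_count, attr_total
```

The appeal of any frequency-based update of this kind is that each step is a simple counting operation, so training stays close to generative learning in cost while the counts are shaped by the classifier's current errors.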
Abstract. It is well known that naive Bayes performs surprisingly well in classification, but its probability estimation is poor. In many applications, however, a ranking based on class probabilities is desired. For example, a ranking of customers in terms of the likelihood that they buy one's products is useful in direct marketing. What is the general performance of naive Bayes in ranking? In this paper, we study this question through both empirical experiments and theoretical analysis. Our experiments show that naive Bayes outperforms C4.4, a state-of-the-art decision-tree algorithm for ranking. We study two example problems that have been used in analyzing the performance of naive Bayes in classification [3]. Surprisingly, naive Bayes performs perfectly on them in ranking, even though it does not in classification. Finally, we present and prove a sufficient condition for the optimality of naive Bayes in ranking.
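The direct-marketing scenario can be illustrated with a toy example (ours, not the paper's): hypothetical customers are ranked by an assumed P(buy | x) from naive Bayes, and the ranking is perfect (AUC = 1.0) even though one customer is misclassified at the 0.5 threshold, which mirrors the abstract's distinction between ranking and classification performance.

```python
# Toy data: made-up customers, probabilities, and outcomes for illustration.
from sklearn.metrics import roc_auc_score

customers = ["ann", "bob", "cho", "dee"]
p_buy     = [0.9,   0.7,   0.6,   0.2]   # hypothetical P(buy | x) from naive Bayes
bought    = [1,     1,     0,     0]     # actual outcomes

ranking = sorted(zip(customers, p_buy), key=lambda t: -t[1])
print("contact order:", [name for name, _ in ranking])
print("AUC:", roc_auc_score(bought, p_buy))            # 1.0: perfect ranking
errors = sum((p >= 0.5) != bool(b) for p, b in zip(p_buy, bought))
print("misclassified at threshold 0.5:", errors)       # 1 (cho)
```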