2011
DOI: 10.1007/978-3-642-23982-3_25

A Technique for Improving the Performance of Naive Bayes Text Classification

Abstract: The Naive Bayes classifier is widely used in text classification tasks; it can perform surprisingly well and is often regarded as a baseline. However, previous research shows that a skewed distribution of the training collection may cause poor results in text classification. This paper presents a new method to deal with this situation. We introduce a conditional probability that takes into account information from both the whole corpus and each category. Our proposed method performs well in the standard…

Citations: cited by 11 publications (4 citation statements)
References: 8 publications
“…In terms of NB, Zhu et al [12] used the NB algorithm for text classification. Jiang et al [41] proposed an improved NB technology for text classification performance. This method solves the problem of unsatisfactory results caused by the uneven distribution of training data.…”
Section: Literature Review (mentioning)
confidence: 99%
“…An objective of the experiment is that the better the results obtained using the same categorization method-the better the representation is. Very popular approach to text categorization is the Naive Bayes classifier, e.g., Jiang et al (2011). We use this widely studied approach to perform classification of the text within 10 test data packages.…”
Section: Evaluation Of Combined Representation (mentioning)
confidence: 99%
“…Up to now, different algorithms have been used to diagnose various diseases. In the present study, the researchers proposed a new model for the diagnosis of heart disease based on the Farmland Fertility Algorithm (FFA) [12] and NB [13]. It used the BFFA for FS and the NB algorithm for samples classification.…”
Section: Introduction (mentioning)
confidence: 99%