2011
DOI: 10.5121/acij.2011.2615

Empirical Studies On Machine Learning Based Text Classification Algorithms

Abstract: Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning, and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised, and semi-supervised. In this paper we present a review of various text classification approaches under the machine learning paradigm. We expect …

Cited by 31 publications (13 citation statements)
References 17 publications
“…We used these five different classifiers since, in machine learning, there is no particular rule as to which classifier will perform best for a given feature set. These classifiers are few of the prominent machine learning classifiers (Dharmadhikari et al., 2011) and were chosen based on their diverse nature of classification. The merit and demerit of each classifier are presented in Table 1.…”
Section: Methods (mentioning)
confidence: 99%
“…The data was collected from several Arabian scientific encyclopedia in many fields. The accuracy was 91% and 93% for literary and scientific corpus, respectively [9].…”
Section: Related Work (mentioning)
confidence: 95%
“…There are two methods utilized in TC: machine learning in which the text can be classified by using a set of training documents, and rule-based TC which allows the usage of experts, or engineer's knowledge to classify the text [18]. Furthermore, the TC can be used in several applications of computer science such as spam or e-mail filtering, or as an accessible tool for interesting information in particular documents [4], [9].…”
Section: Text Classification (mentioning)
confidence: 99%
“…Due to the surge in the size of data for the past two decades, automation process is required to achieve the goals of information extraction and classification/clustering of data for a variety of purposes. Those include email filtering and routing; news observing; Spam filtering and search engines [20]; newsgroups classification, and survey data grouping [17]. Depending on the nature of the available data, machine learning can be classified to three main categories [10] [21].…”
Section: B Machine Learning Techniques (mentioning)
confidence: 99%