A Review of Machine Learning Algorithms for Text-Documents Classification

“…Machine Learning techniques, Data Mining and Natural Language Processing (NLP) work in combination to automatically identify patterns in electronic documents and help classify them into the intended categories (ALmomani et al., 2012). The Naïve Bayes classifier was found to be the most effective in complex real-world scenarios because the model requires only simple initial conditions (Baharudin et al., 2010). Naïve Bayes classifiers can also be trained efficiently.…”

Section: Related Work (mentioning)

confidence: 99%

A Novel Email Response Algorithm for Email Management Systems

Al-Alwani

2014

Journal of Computer Science


Email has been one of the most commonly used tools for communication in recent years, and email management has become a major challenge because of widespread email congestion. This study presents a novel algorithm for automatic email response in an Email Management System to minimize email overload. The proposed model uses a Bayes classifier to categorize emails into classes and generates suitable replies for these classes using information extraction and template filling. Our research aims to intelligently automate email response using Naïve Bayesian classification and to formulate probabilistic dictionaries for accurate information extraction. This research will help reduce email overload and unavoidable congestion by employing a novel email response architecture for email management systems.
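The classify-then-reply idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a multinomial Naïve Bayes model with Laplace smoothing, whitespace tokenization, and a fixed reply template per class. The training emails, class names, templates, and every function name are hypothetical, and the sender name is passed in directly where the paper would use information extraction.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the paper's corpus and classes are not given here.
TRAINING_EMAILS = [
    ("please reset my password", "account"),
    ("cannot log in to my account", "account"),
    ("invoice for last month payment", "billing"),
    ("refund my payment please", "billing"),
]

# Hypothetical reply templates, one per class (stand-in for template filling).
REPLY_TEMPLATES = {
    "account": "Hello {name}, here is how to recover your account.",
    "billing": "Hello {name}, we have forwarded your billing query.",
}


def train_naive_bayes(labelled_emails):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_emails:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Laplace (add-one) smoothed word likelihood
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def draft_reply(model, email_text, sender_name):
    """Classify the email, then fill the template chosen for its class."""
    label = classify(model, email_text)
    return REPLY_TEMPLATES[label].format(name=sender_name)


model = train_naive_bayes(TRAINING_EMAILS)
```

Calling `draft_reply(model, "refund the invoice payment", "Sam")` selects the template of the predicted class and fills in the name. A real system would need far more training data and a proper extraction step.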


“…The formation of base words by the stemming process for documents in Indonesian still has constraints, in that not all words can be truncated properly. This study uses the Zamief Nasri algorithm [16], which has been packaged in the Sastrawi PHP library. Table 5 shows the list of Indonesian affixes.…”

Section: Stemming (mentioning)

confidence: 99%

Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm

Subhan

Sudarsono

Barakbah

2018

emitter


Radical content, in a procedural sense, is content that provokes violence and spreads hatred and anti-nationalism. The definition of radical content differs from country to country, and Indonesia has its own. There, radical content is mostly identified with provocation and with ethnic and religious hatred, known as SARA in the Indonesian language. SARA content is very difficult to detect because of its large volume, its unstructured form, and the noise it contains, which can lead to multiple interpretations. This problem can threaten religious unity and harmony. A system is therefore required that can distinguish radical content from non-radical content. For this system, we propose a text mining approach using a DF threshold and Human Brain for feature extraction. The system is divided into several steps: data collection, which is included in the preprocessing part; text mining; feature selection; classification, which groups the data by class label; similarity calculation on the training data; and visualization of radical and non-radical content. The experimental results show that combining 10-fold cross-validation with k-Nearest Neighbor (kNN) as the classification method achieves 66.37% accuracy with k = 7 [1].
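The feature selection, similarity, classification, and cross-validation steps named in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: "DF threshold" is read as a minimum document frequency, similarity is assumed to be cosine similarity over raw term counts, the "Human Brain" step is omitted, and the toy documents, labels, and function names are all hypothetical. The paper reports k = 7 with 10 folds; the toy data is too small for those values, so smaller ones are used.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper's data set is not reproduced here.
DOCS = [
    "attack hate violence now",
    "hate speech attack them",
    "violence and hate spread",
    "peace unity harmony today",
    "unity and peace talks",
    "harmony peace community",
]
LABELS = ["radical"] * 3 + ["non-radical"] * 3


def select_features(docs, min_df):
    """Keep terms that occur in at least min_df documents (DF threshold)."""
    doc_freq = Counter()
    for text in docs:
        doc_freq.update(set(text.lower().split()))
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


def vectorize(text, features):
    """Map a document to raw term counts over the selected features."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in features]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train_vectors, train_labels, query, k):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(pair[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(docs, labels, n_folds, k, min_df):
    """Accuracy over n_folds; features are selected on training folds only."""
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(len(docs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(docs)) if i % n_folds != fold]
        train_docs = [docs[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        features = select_features(train_docs, min_df)
        train_vectors = [vectorize(d, features) for d in train_docs]
        for i in test_idx:
            query = vectorize(docs[i], features)
            if knn_predict(train_vectors, train_labels, query, k) == labels[i]:
                correct += 1
    return correct / len(docs)


FEATURES = select_features(DOCS, min_df=2)
VECTORS = [vectorize(d, FEATURES) for d in DOCS]
```

A query is classified with, for example, `knn_predict(VECTORS, LABELS, vectorize("peace and unity", FEATURES), k=3)`, and `cross_validate(DOCS, LABELS, n_folds=3, k=3, min_df=1)` returns an accuracy between 0 and 1.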


A Review of Machine Learning Algorithms for Text-Documents Classification

Cited by 259 publications

References 82 publications

Topic Categorization Based on Collectives of TermWeighting Methods for Natural Language Call Routing

