A Review of Different Text Categorization Techniques

Aggarwal, Anubhav; Singh, Jasmeet; Gupta, Kapil

doi:10.14419/ijet.v7i3.8.15210

Cited by 9 publications

(5 citation statements)

References 5 publications

(5 reference statements)

“…And second type of text classification, is called the "multi-label classification" it is considered as a multi-label if there are two or more classes assigned in a document. [8] And today, neural networks and deep learning models create major advances in the natural language processing (NLP) field such as text categorization. HD Wehle (2017) defined deep learning as a form of machine learning that can be either utilized by a supervised or unsupervised learning or both.…”

Section: Neural Network and Deep Learning For Text Categorization (mentioning)

confidence: 99%

“…HD Wehle (2017) defined deep learning as a form of machine learning that can be either utilized by a supervised or unsupervised learning or both. [8] Recently, the success of deep learning models in the image classification have attracted considerable attentions to used it in the text classification problem. [9] In 2014 Yoon Kim, used the Convolutional Neural Network (CNN) to classify sentences.…”

Section: Neural Network and Deep Learning For Text Categorization (mentioning)

confidence: 99%

Prediction of ISO 9001:2015 Audit Reports According to its Major Clauses using Recurrent Neural Networks

Tarnate and Devaraj, 2019, IJRTE

The Quality Assurance Department of the educational sectors is rapidly generating digital documents. The continuous increase of digital documents may become a risk and a challenge in the future. Interpreting and analyzing that digital data in a short period of time is critical for top management to support their decisions. To this end, this paper explored the possibility of using machine learning and data mining to improve the Quality Assurance Management System process, specifically the Quality Audit procedures and the generation of management reports. The researchers developed a machine learning model that classifies an audit report according to the major clauses of the ISO 9001:2015 Quality Management System (QMS) Requirements. The proposed data mining process helps top management identify which principles of the ISO 9001:2015 QMS Requirements they are lacking. The authors used four different Recurrent Neural Networks (RNNs) as classifiers: (1) Long Short-Term Memory (LSTM), (2) Bidirectional-LSTM, (3) Deep-LSTM, and (4) Deep-Bidirectional-LSTM, each with a combined word representation model (word encoding plus an embedding dimension layer). The Deep-Bidirectional-LSTM outperformed the other three RNN models, achieving an average classification accuracy of 91.10%.
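The "word encoding" step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names `build_vocab` and `encode`, the `<pad>` and `<unk>` tokens, the sample sentences, and the `max_len` value are all assumptions made for illustration. It only shows how text might be turned into fixed-length integer sequences before an embedding layer maps each index to a trainable vector.

```python
def build_vocab(docs):
    """Assign an integer index to every distinct lowercase token."""
    # Index 0 is reserved for padding, index 1 for unseen words (assumed convention).
    vocab = {"<pad>": 0, "<unk>": 1}
    for doc in docs:
        for tok in doc.lower().split():
            # setdefault leaves existing tokens alone and numbers new ones in order.
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(doc, vocab, max_len):
    """Map a document to a fixed-length list of token indices."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in doc.lower().split()]
    ids = ids[:max_len]                                 # truncate long documents
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad short ones


# Hypothetical audit-finding sentences, invented for this example.
docs = ["records are not controlled", "audit records are missing"]
vocab = build_vocab(docs)
encoded = [encode(d, vocab, max_len=5) for d in docs]
```

Each row of `encoded` has the same length, which is what a recurrent classifier would typically expect as input; the embedding layer and the LSTM variants themselves are outside the scope of this sketch.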

“…Features refined through selection and extraction are fed into classifiers for training and prediction. Traditionally, the most popular classifiers include Naive Bayes, K Nearest Neighbour, Decision Tree, Random Forest, and Support Vector Machine (Aggarwal et al, 2018). Lately, deep-learning-based classifiers have achieved impressive results in TC as they are able to model complex non-linear relationships within data (Kowsari et al, 2019).…”

Section: Introduction (mentioning)

confidence: 99%

“…A number of review studies have already been carried out. For instance, Aggarwal et al (2018) and Kowsari et al (2019) presented a general overview of TC algorithms; Manikandan and Sivakumar (2018) and Kadhim (2019) conducted surveys on machine-learning-based techniques for TC; Altinel and Ganiz (2018) reviewed the history and development of semantic approaches to TC; Shah and Patel (2016) compared existing methods for feature selection and extraction. However, to our knowledge, no research has been conducted to systematically review TC research with large-scale bibliographic data from a bibliometric perspective.…”

Section: Introduction (mentioning)

confidence: 99%

The Research Trends of Text Classification Studies (2000–2020): A Bibliometric Analysis

Zhu and Lei, 2022, SAGE Open

Text Classification (TC) is the process of assigning several different categories to a set of texts. This study aims to evaluate the state of the art of TC studies. First, TC-related publications indexed in Web of Science were selected as data. In total, 3,121 TC-related publications were published in 760 journals between 2000 and 2020. Then, the bibliographic information was mined to identify the publication trends, important contributors, publication venues, and involved disciplines. In addition, a thematic analysis was performed to extract topics with increasing or decreasing popularity. The findings showed that TC has become a fast-growing interdisciplinary area, and that emerging research powers such as China are playing increasingly important roles in TC research. Moreover, the thematic analysis showed increased interest in topics concerning advanced classification algorithms, performance evaluation methods, and the practical applications of TC. This study will help researchers recognize the recent trends in the area.

“…Data classification is one of the most important tasks for different applications, such as text categorization, tone recognition, image classification, microarray gene expression, and protein structure prediction (Choi et al, 2017; Johnson and Zhang, 2017; Malhotra et al, 2017; Aggarwal et al, 2018; Fang et al, 2018; Mikołajczyk and Grochowski, 2018; Kerkeni et al, 2019; Saritas and Yasar, 2019; Yildirim et al, 2019; Chandrasekar et al, 2020). Many types of information (e.g., language, music, and gene) can be represented as sequential data that often contains related information separated by many time steps, and these long-term dependencies are difficult to model as we must retain information from the whole sequence with greater complexity of the model (Trinh et al, 2018; Liu et al, 2019; Shewalkar, 2019; Yu et al, 2019; Zhao et al, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

SS-RNN: A Strengthened Skip Algorithm for Data Classification Based on Recurrent Neural Networks

et al. 2021

Recurrent neural networks are widely used in time series prediction and classification. However, they suffer from problems such as insufficient memory ability and difficulty in gradient back propagation. To solve these problems, this paper proposes a new algorithm called SS-RNN, which directly uses multiple pieces of historical information to predict the information at the current time. This enhances long-term memory and, along the time direction, improves the correlation of states at different moments. To include the historical information, we design two processing methods for the SS-RNN, one continuous and one discontinuous. For each method, there are two ways to add historical information: 1) direct addition and 2) adding weighting and function mapping to the activation function. It provides six pathways so as to fully and deeply explore the effect and influence of historical information on RNNs. Comparing average accuracy on real datasets against long short-term memory, Bi-LSTM, gated recurrent units, and MCNN, and calculating the main indexes (Accuracy, Precision, Recall, and F1-score), shows that our method can improve average accuracy, optimize the structure of the recurrent neural network, and effectively address the problems of exploding and vanishing gradients.
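The idea of "direct addition" of several earlier states, as described in the abstract above, can be illustrated with a toy scalar recurrence. This is a hedged sketch only, not the SS-RNN equations from the paper: the function name `skip_rnn_states`, the weights `w_in` and `w_rec`, and the `skips` offsets are hypothetical choices made to show the general mechanism.

```python
import math


def skip_rnn_states(inputs, w_in=0.5, w_rec=0.5, skips=(1, 2, 3)):
    """Scalar recurrent update that adds several earlier hidden states directly.

    Hypothetical illustration: w_in, w_rec and skips are made-up parameters,
    not values or equations taken from the SS-RNN paper.
    """
    states = []
    for t, x in enumerate(inputs):
        # Sum the hidden states that sit k steps back, wherever they exist.
        history = sum(states[t - k] for k in skips if t - k >= 0)
        states.append(math.tanh(w_in * x + w_rec * history))
    return states


# A single input pulse followed by zeros, to see how long its trace survives.
pulse = [1.0, 0.0, 0.0, 0.0]
plain = skip_rnn_states(pulse, skips=(1,))         # ordinary one-step recurrence
skipped = skip_rnn_states(pulse, skips=(1, 2, 3))  # also reads states 2 and 3 steps back
```

With only the one-step connection, the trace of the initial pulse shrinks at every step; with the extra skip connections, later states also read older states directly, so the trace decays more slowly in this toy setting. Whether that carries over to a trained network is a separate question that the sketch does not address.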
