2020
DOI: 10.14569/ijacsa.2020.0110748

A Hybrid Document Features Extraction with Clustering based Classification Framework on Large Document Sets

Abstract: As document collections grow day by day, finding essential document clusters for the classification problem is a major challenge due to high inter- and intra-document variations. Also, most conventional classification models, such as SVM, neural network, and Bayesian models, have a high true negative rate and error rate in the document classification process. In order to improve the computational efficacy of the traditional document classification models, a hybrid feature extracti…

Citations: cited by 12 publications (6 citation statements)
References: 30 publications (30 reference statements)
“…With [13][14][15] emphasizing the role of mobile network operators in managing IoT communication data and spotlighting the challenges in new media system analysis, the overarching theme becomes evident. There's an urgent need for a robust model that navigates the intricacies of device communication data and provides tangible insights for device functionality enhancement and security fortification [16][17][18][19][20]. This backdrop amplifies the motivation behind the proposed work, which aims to leverage the K-means clustering algorithm to decode intricate communication patterns from modern electronic devices [21][22][23].…”
Section: Literature Review | Type: mentioning
confidence: 99%
“…The results of their experiments revealed that the CPAMF results were better than those of the cosine measure and BM25 by a healthy margin. For research article categorization, some authors have proposed hybrid approaches [19], [27], [28], [29], [30]. In these approaches, feature extraction is performed utilizing DL techniques and classification based on ML and DL methods.…”
Section: Literature Review | Type: mentioning
confidence: 99%
“…These techniques can recognize the context of words in a research article, such as semantic and grammatical similarities, as well as correlations with other words. Owing to the increasing use of these techniques by researchers in different domains, the document classification community started the utilization of these techniques in their studies [14], [18], [19], [20] which presented promising results. One of the issues related to these techniques is the large length of the vector generated against a single word in a text.…”
Section: Introduction | Type: mentioning
confidence: 99%
“…Comparison against the performance with SVM, Rocchio algorithm, Bayes, Naïve Bayes is mentioned in the paper, however, authors have not provided the table or graph results. Some authors proposed hybrid approaches for textual document classification [21] [22] [23]. In hybrid approaches, the algorithms focused on both feature extraction using deep learning and classification using machine and deep learning.…”
Section: Related Work | Type: mentioning
confidence: 99%