Abstract: As document collections grow day by day, finding essential document clusters for classification is a major problem because of high inter- and intra-document variation. Also, most conventional classification models, such as SVM, neural networks, and Bayesian models, have a high true negative rate and error rate in document classification. In order to improve the computational efficacy of the traditional document classification models, a hybrid feature extracti…
“…With [13][14][15] emphasizing the role of mobile network operators in managing IoT communication data and spotlighting the challenges in new media system analysis, the overarching theme becomes evident. There's an urgent need for a robust model that navigates the intricacies of device communication data and provides tangible insights for device functionality enhancement and security fortification [16][17][18][19][20]. This backdrop amplifies the motivation behind the proposed work, which aims to leverage the K-means clustering algorithm to decode intricate communication patterns from modern electronic devices [21][22][23].…”
From smart home devices to wearables, electronics have become an indispensable part of modern life. These devices collect vast volumes of data that reveal detailed information about device communications, user behaviours, and more. Improving device features, understanding the user experience, and detecting security risks are just some of the uses for this information. However, advanced analytical methods are required to make sense of this much data. The present research uses the K-means clustering algorithm to analyse the data sent and received by different types of electronics. The first step is data collection, with the aim of building a representative sample of people using various devices and communication methods. The collected data is then preprocessed so that it can be analysed. Next, the K-means algorithm groups the data into clusters that represent distinct modes of interaction. The primary objective of the research is to better understand these clusters by showing how users communicate, how devices communicate, and where functionality and security can be enhanced.
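The collect, preprocess, and cluster steps described in this abstract can be illustrated with a minimal sketch. It is not the cited authors' implementation: the two features (messages per hour, mean payload size in bytes), the sample values, and the function names `standardize` and `kmeans` are all assumptions made for illustration, and the clustering is plain Lloyd-style K-means written with the Python standard library only.

```python
import random
from math import dist

def standardize(rows):
    """Scale each column to zero mean and unit variance (assumed preprocessing step)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical device records: [messages per hour, mean payload size in bytes].
records = [[2, 120], [3, 110], [2, 130], [40, 900], [42, 950], [38, 880]]
labels, centroids = kmeans(standardize(records), k=2)
print(labels)
```

Standardizing first keeps the payload-size column from dominating the Euclidean distance simply because its raw values are larger. The number of clusters `k` is fixed by hand here; how the cited work chooses it is not stated in the abstract.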
“…The results of their experiments revealed that the CPAMF results were better than those of the cosine measure and BM25 by a healthy margin. For research article categorization, some authors have proposed hybrid approaches [19], [27], [28], [29], [30]. In these approaches, feature extraction is performed utilizing DL techniques and classification based on ML and DL methods.…”
“…These techniques can recognize the context of words in a research article, such as semantic and grammatical similarities, as well as correlations with other words. Owing to the increasing use of these techniques by researchers in different domains, the document classification community started the utilization of these techniques in their studies [14], [18], [19], [20] which presented promising results. One of the issues related to these techniques is the large length of the vector generated against a single word in a text.…”
We extend our heartfelt gratitude and appreciation to Qatar National Library for their generous support in providing Open Access funding for this research.
“…Comparison against the performance with SVM, Rocchio algorithm, Bayes, Naïve Bayes is mentioned in the paper, however, authors have not provided the table or graph results. Some authors proposed hybrid approaches for textual document classification [21] [22] [23]. In hybrid approaches, the algorithms focused on both feature extraction using deep learning and classification using machine and deep learning.…”
Existing document classification techniques draw on different data sources, taken either from the content or from the metadata of research articles. Various journal publishers, such as Springer, Elsevier, and IEEE, do not provide open access to the content of research articles, whereas the metadata is freely available. Metadata such as title, keywords, and abstract can serve as a better alternative to the content in various scenarios. In the current literature, researchers have assessed the role of some metadata fields individually. We believe that the collective contribution of metadata parameters can play a significant role in classifying research papers. This paper presents a comprehensive evaluation of the role of metadata, individually as well as in combination, for research paper classification. Moreover, we have classified the research articles into ACM hierarchy root categories (e.g. general literature, hardware, software, etc.). In this comprehensive evaluation, we have assessed all the possible combinations of metadata features against different classifiers such as Random Forest, K Nearest Neighbor, and Decision Tree. The results of this research reveal that the combination of title and keywords outperforms other combinations with an F-measure score of 0.88.
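As a rough illustration of what evaluating every combination of metadata features involves, the sketch below enumerates all non-empty subsets of three fields and builds the text each subset would contribute. The field names, the sample record, and the helper names `metadata_combinations` and `combine_text` are hypothetical; the abstract does not say how the features were combined, and training and scoring the classifiers is left out.

```python
from itertools import combinations

def metadata_combinations(fields):
    """Return every non-empty subset of the given metadata fields."""
    return [combo
            for size in range(1, len(fields) + 1)
            for combo in combinations(fields, size)]

def combine_text(record, combo):
    """Concatenate the chosen metadata fields of one paper into a single string."""
    return " ".join(record[field] for field in combo)

# Assumed field names; the abstract lists title, keyword, and abstract as examples.
FIELDS = ("title", "keywords", "abstract")

# Hypothetical record used only to show the shape of the data.
paper = {
    "title": "Clustering device logs",
    "keywords": "k-means IoT",
    "abstract": "We group communication records.",
}

combos = metadata_combinations(FIELDS)
for combo in combos:
    print("+".join(combo), "->", combine_text(paper, combo))
```

With three fields there are 2^3 - 1 = 7 non-empty subsets, so each classifier would be trained and scored seven times, once per subset.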