2021
DOI: 10.11591/ijece.v11i1.pp664-670

Text documents clustering using data mining techniques

Abstract: Rapid progress in numerous research fields and information technologies has led to an increase in the publication of research papers. As a result, researchers spend a lot of time finding relevant research papers that are close to their field of specialization. Consequently, in this paper we propose a document classification approach that can cluster the text documents of research papers into meaningful categories, each covering a similar scientific field. Our presented approach is based on essential …

Cited by 33 publications (14 citation statements)
References 25 publications
“…The second step is to remove stop words that don't make sense. The final step separates the roots of the word [through] lemmatization, allowing the processing of words that appear different but have the same root as a single form [43]. After this stage, the texts are ready for the feature extraction process.…”
Section: Pre-processing
Citation type: mentioning
confidence: 99%
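
The two steps quoted above (stop-word removal, then lemmatization) can be sketched roughly as follows. This is a minimal illustration, not the citing paper's implementation: the tokenization step, the tiny `STOP_WORDS` set, and the `LEMMAS` lookup table are all assumptions made here, since the excerpt omits the first step and does not name a lemmatizer.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

# Hypothetical lemma lookup standing in for a real lemmatizer.
LEMMAS = {
    "clusters": "cluster",
    "clustering": "cluster",
    "documents": "document",
    "papers": "paper",
}


def preprocess(text):
    """Tokenize, drop stop words, then map each remaining token to its lemma."""
    # Assumed first step: lowercase and split into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Second step in the excerpt: remove stop words.
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Final step in the excerpt: reduce different surface forms to one form.
    return [LEMMAS.get(t, t) for t in kept]


print(preprocess("Clustering the documents of research papers"))
# prints ['cluster', 'document', 'research', 'paper']
```

In practice the lookup table would be replaced by a dictionary-based lemmatizer from an NLP library; the table is used here only to keep the sketch self-contained.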
“…Data mining is the process by which useful information is collected from large amounts of data. Data mining techniques have been used to solve a variety of real-life problems like clustering [1]. In clustering[, the goal is] categorizing a population [of] N data points into K subgroups so that data points in one group are more similar to [each other than to] data points in other groups.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
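
To make the idea of splitting N data points into K subgroups concrete, here is a small sketch using k-means (Lloyd's algorithm). The excerpt does not name a specific algorithm, so the choice of k-means, the function names, the evenly spaced initialization, and the sample points are all assumptions for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Partition points into k groups; returns (labels, centroids)."""
    # Naive initialization (an assumption): pick k evenly spaced points.
    # Real implementations usually use random or k-means++ seeding.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
labels, centroids = kmeans(points, k=2)
print(labels)  # prints [0, 0, 1, 1]
```

The result depends on the initial centroids; a poor starting choice can leave k-means in a worse local optimum, which is why seeding strategies matter.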
“…Subeno et al. [7] aimed to determine the optimal number of corpus topics in the LDA method. The proposed approach in [8] can cluster the text documents of research papers into meaningful categories which contain a similar scientific field using [the] title, abstract, and keywords of the paper to the categories topics. Chauhan and Shah [9] introduced the preliminaries of the topic modeling techniques and reviewed its extensions and variations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
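
As a rough illustration of grouping papers by their title, abstract, and keywords, the sketch below joins the three fields into one text per paper, weights terms with TF-IDF, and compares papers by cosine similarity. TF-IDF and cosine similarity are stand-ins chosen here for illustration; the excerpt does not confirm which features or similarity measure the approach in [8] uses, and the sample papers are made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (count / len(doc)) * math.log(n / df[t]) for t, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical papers, each described by title, abstract, and keywords.
papers = [
    {"title": "topic models", "abstract": "lda topic inference", "keywords": "lda"},
    {"title": "lda topics", "abstract": "topic model lda", "keywords": "topic"},
    {"title": "image segmentation", "abstract": "pixel image regions", "keywords": "image"},
]

# Join the three fields into a single token list per paper.
docs = [
    (p["title"] + " " + p["abstract"] + " " + p["keywords"]).split() for p in papers
]
vecs = tfidf_vectors(docs)

# The two topic-modeling papers share terms; the image paper shares none.
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # prints True
```

Vectors like these could then be passed to any clustering algorithm so that papers with overlapping vocabulary end up in the same category.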