2020
DOI: 10.26555/jifo.v14i2.a17513

State of the art document clustering algorithms based on semantic similarity

Cited by 19 publications (9 citation statements)
References 35 publications (65 reference statements)
“…Another close problem is to find clusters of related news articles such that a cluster has a high internal coherence ( i.e ., having articles from the same topic or word distribution), but different from other clusters ( Bisandu, Prasad & Liman, 2018 ; Salih & Jacksi, 2020 ; Khan et al, 2018 ). While articles within the cluster might be candidate background to each other, the cluster size is big, and selecting specific articles from the cluster to present to the reader of a specific query article is again a challenging problem.…”
Section: Background Linking Problem (mentioning)
confidence: 99%
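The coherence criterion in the statement above (articles inside a cluster resemble each other more than they resemble articles in other clusters) can be checked numerically. The following is a minimal sketch, not the method of any cited paper: it assumes each article is already represented as a term-weight vector, uses cosine similarity as the resemblance measure, and the vectors and the `sports` / `politics` cluster names are hypothetical toy data.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_intra_similarity(cluster):
    """Average similarity over all pairs inside one cluster (needs >= 2 members)."""
    pairs = list(combinations(cluster, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def mean_inter_similarity(cluster_a, cluster_b):
    """Average similarity over all cross-cluster pairs."""
    sims = [cosine(u, v) for u in cluster_a for v in cluster_b]
    return sum(sims) / len(sims)


# Hypothetical term-weight vectors for six articles (toy data, assumed for illustration).
sports = [[3.0, 1.0, 0.0], [2.0, 1.0, 0.0], [4.0, 2.0, 1.0]]
politics = [[0.0, 1.0, 3.0], [0.0, 2.0, 4.0], [1.0, 1.0, 3.0]]

intra_sports = mean_intra_similarity(sports)
intra_politics = mean_intra_similarity(politics)
inter = mean_inter_similarity(sports, politics)

# A coherent clustering has both intra-cluster means above the inter-cluster mean.
print(f"intra sports:   {intra_sports:.3f}")
print(f"intra politics: {intra_politics:.3f}")
print(f"inter:          {inter:.3f}")
```

A high intra-cluster mean alone does not solve the selection problem the statement raises: when a cluster is large, every member scores well against the query article, so a separate ranking step is still needed to pick specific background articles.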
“…The effectiveness of word embeddings in projecting keyword relationships as well as their performance in topic modeling [32,33], clustering [34], and document classification [35,36] consistently inspire researchers to propose methodologies that summarize word vectors into detailed structures reflected on textual topics [37,38]. In general, the effect, evaluation, and selection of the different techniques in cluster analysis and topic extraction vary in relevant experiments [39,40]. Some of the standard methodologies that utilize word vectors for topic extraction are the Gaussian Mixture Models (GMM) [32,41], Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [13,42], hard and Fuzzy K-Means [34,43], and more complex approaches that are based on textual similarities [40].…”
Section: Cluster Analysis and Word Embeddings (mentioning)
confidence: 99%
“…In general, the effect, evaluation, and selection of the different techniques in cluster analysis and topic extraction vary in relevant experiments [39,40]. Some of the standard methodologies that utilize word vectors for topic extraction are the Gaussian Mixture Models (GMM) [32,41], Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [13,42], hard and Fuzzy K-Means [34,43], and more complex approaches that are based on textual similarities [40].…”
Section: Cluster Analysis and Word Embeddings (mentioning)
confidence: 99%
“…This process is carried out on a group of things that have been gathered together [11][12][13]. Clustering is a technique that can be used to organize data structures into a number of groups that are incompatible with one another and are referred to collectively as clusters [14][15][16]. Clustering is a strategy that may be employed.…”
Section: Introduction (mentioning)
confidence: 99%