2015
DOI: 10.5897/sre2014.6107

An efficient hybrid distributed document clustering algorithm

Abstract: Recent advances in information technology have pushed data volumes beyond the petabyte scale. Clustering distributed document sets from a central location is difficult because of the massive demand for computational resources, so distributed document clustering algorithms are needed to cluster documents using distributed resources. The greatest challenge in distributed document clustering is the clustering quality and speedup associated with growing document sets…

Cited by 3 publications (1 citation statement)
References 13 publications
“…The vector space model (VSM) is a widely practiced model for document preprocessing [22]. Particle Swarm Optimization (PSO) and Latent Semantic Indexing based optimization of MRK-Means are implemented in [23] on the Reuters-21578 dataset after the dataset is preprocessed using the VSM. The PSO is used for selecting the best centroids and dimensionality reduction.…”
Section: Related Work (mentioning)
confidence: 99%
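
The pipeline named in the citation statement (vector space model preprocessing followed by K-Means clustering) can be illustrated with a minimal sketch. This is not the MRK-Means, PSO, or Latent Semantic Indexing method of the cited work: it is plain single-machine k-means over TF-IDF vectors, with hand-picked initial centroids standing in for the centroid selection that PSO is said to perform. The function names (`tfidf_vectors`, `kmeans`), the smoothed IDF formula, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Vector space model: map each document to a unit-length TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (an assumed variant; other weightings are common).
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] / len(toks) * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity for vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))


def kmeans(vectors, init_centroids, iterations=10):
    """Plain k-means using cosine similarity and caller-supplied centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each document to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        # Update step: recompute each centroid as the normalized mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[k] = [x / norm for x in mean]
    return labels


# Toy documents (hypothetical, not taken from Reuters-21578).
docs = [
    "oil prices rise",
    "crude oil prices fall",
    "wheat harvest grows",
    "wheat grain harvest falls",
]
vocab, vectors = tfidf_vectors(docs)
# Hand-picked initial centroids; a PSO-style search would choose these instead.
labels = kmeans(vectors, [vectors[0], vectors[2]])
print(labels)  # [0, 0, 1, 1]
```

The result of k-means depends heavily on the initial centroids, which is presumably why the cited work applies an optimizer to that choice; here the two starting points are simply fixed by hand.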