A Frequent Concepts Based Document Clustering Algorithm

Baghel, Rekha; Dhir, Renu

doi:10.5120/826-1171

Cited by 30 publications

(20 citation statements)

References 18 publications

“…In [11], a new technique based on frequent concepts for document clustering is proposed. Frequent Concepts based Document Clustering (FCDC) algorithm utilizes the semantic relationship between words, explored using WordNet ontology, to create concepts.…”

Section: Review Of Semantic Driven Document Clustering Methods (mentioning)

confidence: 99%

“…But, taking into account synonyms and hypernyms, disambiguated only by PoS tags, is not successful in improving clustering effectiveness because of the noise produced by all the incorrect senses extracted from WordNet. A possible solution is proposed which uses a word-by-word disambiguation in order to choose the correct sense of a word in [11]. In [6] Clustering based on Frequent Word Sequences (CFWS) has been proposed.…”

Section: Overview Of Clustering Algorithms (mentioning)

confidence: 99%

Semantic based Document Clustering: A Detailed Review

Shah, Mahajan

2012

IJCA

“…The experiments showed that using the semantic WN concepts features were promising and outperformed the baseline BoW model. Also, Baghel and Dhir in [18] proposed a hierarchy clustering algorithm to cluster the documents based on the concepts representation. The concepts were extracted from WN using the FstC WSD strategy.…”

Section: Related Work (mentioning)

confidence: 99%

Semantic Sentiment Analysis of Arabic Texts

Alowaidi, Saleh, Abulnaja

2017

IJACSA

“…A good stemmer should be able to convert different syntactic forms of a word into its normalized form, reduce the number of index terms, save memory and storage and may increase the performance of clustering algorithms to some extent; meanwhile it should try stemming. Porter Stemmer [27] is a widely applied method to stem documents. It is compact, simple and relatively accurate.…”

Section: A Document Preprocessing Stage (mentioning)

confidence: 99%

Clustering Web Documents based on Efficient Multi-Tire Hashing Algorithm for Mining Frequent Termsets

Negm, Elkafrawy, Amin, et al.

2013

IJARAI

…(MTHFT) instead of the Apriori algorithm. The algorithm generates frequent termsets by building the multi-tire hash table during a single scan of the documents. The Multi-Tire technique is used in the proposed hashing algorithm to avoid hash collisions. The documents are partitioned based on the generated frequent termsets, and clustering proceeds by grouping the partitions through their descriptive keywords. Using the MTHFT algorithm reduces the scanning and computational cost, which considerably improves performance and speeds up the clustering process. The CWDHFT approach improved accuracy, scalability and efficiency when compared with existing clustering algorithms such as Bisecting K-means and FIHC.
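The single-scan idea in this abstract can be illustrated with a much-simplified sketch. It is not the MTHFT algorithm: a plain Python `Counter` stands in for the multi-tire hash table, and the function name `frequent_termsets`, the toy documents and the support threshold are all assumptions made for illustration. It only shows candidate termsets being counted in a hash table during one pass over the documents, followed by a minimum-support filter.

```python
from collections import Counter
from itertools import combinations


def frequent_termsets(docs, min_support, max_size=2):
    """Count termsets of size 1..max_size in one pass over docs.

    Hypothetical helper for illustration only; a Counter (a hash table)
    stands in for the multi-tire hash table of the cited paper.
    """
    counts = Counter()
    for doc in docs:  # each document is scanned exactly once
        terms = sorted(set(doc.lower().split()))
        for size in range(1, max_size + 1):
            for termset in combinations(terms, size):
                counts[termset] += 1
    # keep termsets that occur in at least min_support documents
    return {ts: n for ts, n in counts.items() if n >= min_support}


# Toy corpus (assumed data, not taken from the paper)
docs = [
    "document clustering with frequent terms",
    "frequent terms for web clustering",
    "hash table for frequent terms",
]
result = frequent_termsets(docs, min_support=2)
```

Each key of `result` is a sorted tuple of terms and each value is the number of documents containing that termset. Documents sharing a frequent termset could then be grouped into an initial partition, which is roughly the role the abstract assigns to the generated termsets.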
