2015
DOI: 10.1016/j.eswa.2014.11.003

A hybrid evolutionary computation approach with its application for optimizing text document clustering

Cited by 47 publications (22 citation statements)
References 31 publications
“…In common practice, we set the number of terms to be constant among topics, so the above equation can be simplified as s(t) = N/n, where n is the number of unique terms in any topic t. Then, if s(t) is greater than a user-defined threshold, t_i and t_j are merged. Compared to extant topic modelling methods (Song, Qiao, Park, & Qian, 2015; Wang, Mao, Wang, & Guo, 2017) that utilise a Gaussian-Poisson distribution to approximate the document-topic and topic-word distributions, or that optimise a log-linear model (Li, Duan, et al., 2017), our method restricts the overlap between topics and is more internally consistent. Based on the above method, the top 5 topics in FLSs are reported in Table 2.…”
Section: Feature Engineering From FLSs (mentioning)
confidence: 99%
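The merging rule quoted above lends itself to a short illustration. Below is a minimal Python sketch, assuming each topic is represented as a set of terms; the function names, the example topics, and the threshold value of 0.5 are all illustrative, not taken from the cited paper.

```python
# Minimal sketch of the topic-merging rule quoted above (illustrative only).
# Assumes each topic is a fixed-size set of terms; s(t) = N / n, where N is
# the number of terms shared by two topics and n is the number of unique
# terms in a topic (constant across topics by assumption).

def overlap_score(topic_i: set, topic_j: set) -> float:
    """s(t) = N / n: shared terms over the (constant) per-topic term count."""
    n = len(topic_i)                # unique terms per topic (constant)
    N = len(topic_i & topic_j)      # terms common to both topics
    return N / n

def merge_if_overlapping(topics: list, threshold: float = 0.5) -> list:
    """Greedily merge any pair of topics whose overlap exceeds the threshold."""
    merged = [set(t) for t in topics]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlap_score(merged[i], merged[j]) > threshold:
                    merged[i] |= merged[j]   # t_i := t_i ∪ t_j
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged

topics = [{"nasa", "orbit", "launch"}, {"orbit", "launch", "shuttle"},
          {"hockey", "goal", "puck"}]
print(merge_if_overlapping(topics, threshold=0.5))
```

On this toy input, the first two topics share 2 of 3 terms (s(t) = 2/3 > 0.5) and are merged, while the third remains separate.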
“…The data in this set is considered particularly noisy and, as might be expected, includes complications such as duplicate entries and cross-postings. We construct a 500-document subset of the 20 Newsgroups dataset in the same way as Song et al. [21], by randomly taking 100 documents from each of five categories (comp.os.ms-windows.misc, misc.forsale, rec.sport.hockey, sci.space, soc.religion.christian).…”
Section: 20 Newsgroup (Name: 20NG5) (mentioning)
confidence: 99%
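A subset of this kind can be reproduced along the following lines with scikit-learn's fetch_20newsgroups loader. The fixed seed and the choice of the full ("all") split are assumptions made for illustration, not details reported by Song et al. [21].

```python
# Sketch of building the 20NG5 subset described above: 100 random documents
# from each of the five named categories (seed and split are illustrative).
import random
from sklearn.datasets import fetch_20newsgroups

CATEGORIES = ["comp.os.ms-windows.misc", "misc.forsale",
              "rec.sport.hockey", "sci.space", "soc.religion.christian"]

rng = random.Random(42)                    # fixed seed for reproducibility
subset_docs, subset_labels = [], []
for label, cat in enumerate(CATEGORIES):
    bundle = fetch_20newsgroups(subset="all", categories=[cat],
                                remove=("headers", "footers", "quotes"))
    picks = rng.sample(range(len(bundle.data)), 100)   # 100 docs per category
    subset_docs.extend(bundle.data[i] for i in picks)
    subset_labels.extend([label] * 100)

print(len(subset_docs))                    # 500 documents in total
```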
“…Genetic algorithms have been used in text clustering [2]. For example, Song et al. [21] used a GA in combination with swarm-intelligence techniques to optimise text clustering. In this case, the GA is used to find an optimal set of centres for the text clusters.…”
Section: Introduction (mentioning)
confidence: 99%
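As a rough illustration of that idea, the sketch below encodes each GA individual as a set of k candidate cluster centres over dense document vectors and evolves the population with selection, uniform crossover, and Gaussian mutation. This is a generic GA sketch, not the hybrid operator set of Song et al. [21]; all parameter values are illustrative.

```python
# Illustrative GA that searches for k cluster centres over dense vectors.
# Sketches the general idea only; the paper's hybrid GA/swarm method
# differs in its operators and update rules.
import numpy as np

def fitness(centres, X):
    """Negative sum of distances from each document to its nearest centre."""
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return -d.min(axis=1).sum()

def ga_cluster(X, k=3, pop_size=20, generations=50, seed=0):
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Each individual is k centres initialised from the documents themselves.
    pop = np.stack([X[rng.choice(n, k, replace=False)]
                    for _ in range(pop_size)])
    for _ in range(generations):
        scores = np.array([fitness(ind, X) for ind in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]           # selection: keep top half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.choice(len(elite), 2, replace=False)]
            mask = rng.random((k, 1)) < 0.5           # uniform crossover
            child = np.where(mask, a, b)
            child += rng.normal(0, 0.01, child.shape) # small Gaussian mutation
            children.append(child)
        pop = np.concatenate([elite, np.stack(children)])
    scores = np.array([fitness(ind, X) for ind in pop])
    return pop[scores.argmax()]
```

With TF-IDF features one might call ga_cluster on a densified matrix, e.g. TfidfVectorizer(max_features=100).fit_transform(docs).toarray().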
“…In some of the literature, additional information such as side-information [40] and privileged information [41] is introduced for text clustering. Moreover, several global optimization algorithms have been utilized for text clustering, such as the particle swarm optimization (PSO) algorithm [42,43] and the bee colony optimization (BCO) algorithm [44,45].…”
Section: Clustering Algorithm (mentioning)
confidence: 99%
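For comparison with the GA sketch above, a textbook global-best PSO applied to the same centre-search formulation might look as follows. This is the standard velocity/position update with conventional inertia and acceleration constants, not the specific PSO or BCO variants cited in [42-45].

```python
# Minimal global-best PSO sketch for the same centre-search problem
# (illustrative defaults; lower cost is better).
import numpy as np

def pso_cluster(X, k=3, particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    pos = np.stack([X[rng.choice(n, k, replace=False)]
                    for _ in range(particles)])
    vel = np.zeros_like(pos)

    def cost(c):
        # Sum of distances from each document to its nearest centre.
        return np.linalg.norm(X[:, None] - c[None], axis=2).min(axis=1).sum()

    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    gbest = pbest[pbest_cost.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        costs = np.array([cost(p) for p in pos])
        better = costs < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], costs[better]
        gbest = pbest[pbest_cost.argmin()]
    return gbest
```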