The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)
DOI: 10.1109/wi.2005.138
Standardized Evaluation Method for Web Clustering Results

Cited by 17 publications (11 citation statements)
References 6 publications
“…Therefore, in this paper we used the Entropy (E) and Purity (P) measures to assess the performance of the algorithms. This is possible since the groups are known in advance, but this information is used here only for benchmarking purposes, which is a common practice in the literature (Crabtree et al., 2005; Zhao and Karypis, 2004). Given a cluster S_r of size n_r, the entropy E(S_r) of this cluster can be measured as follows: …”
Section: Methods (mentioning, confidence: 99%)
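The entropy formula itself is truncated in the snippet above, so the following is a hedged sketch using the standard per-cluster entropy and purity definitions from Zhao and Karypis (2004), which the citing paper references; the function names and the normalization by log(q) are assumptions, not taken from the snippet.

```python
from math import log

def cluster_entropy(class_counts):
    """Entropy of one cluster S_r, given the count of its members per true
    class. Normalized by log(q) (q = number of classes) so values lie in
    [0, 1]; lower is better. Standard Zhao & Karypis (2004) definition,
    assumed here because the snippet's formula is truncated."""
    n_r = sum(class_counts)          # cluster size
    q = len(class_counts)            # number of known classes
    if q < 2 or n_r == 0:
        return 0.0
    return -sum((c / n_r) * log(c / n_r)
                for c in class_counts if c) / log(q)

def cluster_purity(class_counts):
    """Purity of one cluster: fraction belonging to its dominant class.
    Higher is better; 1.0 means the cluster contains a single class."""
    n_r = sum(class_counts)
    return max(class_counts) / n_r if n_r else 0.0
```

For example, a cluster drawn entirely from one class, `[10, 0, 0]`, has entropy 0.0 and purity 1.0, while a cluster split evenly across two classes, `[5, 5]`, has entropy 1.0 and purity 0.5.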
“…In this study, nine representative cluster validity indexes (sum of squared error [23], entropy [23,25], relaxation error [26], Davies-Bouldin index, Calinski-Harabasz index, Silhouette statistic, Dunn index, SD validity index, and S_Dbw validity index) were used to evaluate the clustering results. These cluster validity indexes are commonly used to evaluate clustering results.…”
Section: Cluster Validity Index (mentioning, confidence: 99%)
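Three of the nine internal validity indexes named in the statement above (Davies-Bouldin, Calinski-Harabasz, and the Silhouette statistic) ship with scikit-learn, so a minimal sketch of how they are applied to a clustering result might look as follows; the synthetic two-blob dataset is an assumption for illustration, and the remaining indexes would need custom implementations or other packages.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    davies_bouldin_score, calinski_harabasz_score, silhouette_score)

rng = np.random.default_rng(0)
# Toy dataset (assumed for illustration): two well-separated 2-D blobs.
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(3.0, 0.3, (50, 2))])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Internal indexes: computed from the data and labels alone,
# with no gold-standard classes needed.
print("Davies-Bouldin:   ", davies_bouldin_score(X, labels))    # lower is better
print("Calinski-Harabasz:", calinski_harabasz_score(X, labels)) # higher is better
print("Silhouette:       ", silhouette_score(X, labels))        # higher is better
```

Because the blobs are well separated, all three indexes should report a strong clustering (low Davies-Bouldin, high Calinski-Harabasz and Silhouette); on real web-clustering output the scores are typically far less clear-cut.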
“…In Crabtree et al. (2005) it is suggested that evaluation methodologies can be split into two categories: internal quality, based on objective functions specific to the algorithm, and external quality, which evaluates the output clustering. External quality assessment can be further divided into gold-standard, task-oriented and user evaluation.…”
Section: Measuring Cluster Quality (mentioning, confidence: 99%)
“…The size of the dataset was chosen as this was the maximum number of results available from the default search engine API (Yahoo), as well as considering the inefficiency of downloading all the results for processing. This paper includes the results of TCA, FTCA, STC and Lingo on four queries used in other papers (Xiao and Hung 2008; Crabtree et al. 2005; Janruang and Kreesuradej 2006) (Jaguar, Apple, Java, Salsa), using the Yahoo! search API (except for the Jaguar dataset, which is taken from Xiao and Hung 2008), as well as results from ODP.…”
Section: Measuring Cluster Quality (mentioning, confidence: 99%)