Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002
DOI: 10.1145/564376.564412
Document clustering with committees

Abstract: Document clustering is useful in many information retrieval tasks: document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of documents, etc. The general goal of clustering is to group data elements such that the intra-group similarities are high and the inter-group similarities are low. We present a clustering algorithm called CBC (Clustering By Committee) that is shown to produce higher quality clusters in document clustering tasks as compared to several well kn…
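The clustering goal stated in the abstract (high intra-group similarity, low inter-group similarity) is typically measured with cosine similarity between document term vectors. A minimal sketch, using invented toy vectors rather than anything from the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Illustrative document vectors over three terms:
# two documents share vocabulary, one does not.
doc_a = [3, 1, 0]
doc_b = [2, 2, 0]
doc_c = [0, 0, 4]

intra = cosine(doc_a, doc_b)  # high: same cluster
inter = cosine(doc_a, doc_c)  # low: different clusters
```

A good clustering maximizes the first quantity within each cluster while keeping the second low across clusters.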

Cited by 117 publications (54 citation statements); references 10 publications.
“…Pantel (2005) created semantic vectors for each word in WordNet by disambiguating contexts which appeared with different senses of a word. The building of semantic vectors is described in (Pantel 2003). WordNet's hierarchy was used to propagate contexts where words may appear throughout the network.…”
Section: Updating WordNet
confidence: 99%
“…Once the frequency vectors are collected, the pointwise mutual information vectors are generated since experiments showed that they produce much higher quality clusters than by using the term frequency vectors C(e) [10].…”
Section: There Is a List of Standard Dependency Relations Listed In [
confidence: 99%
“…Mutual information vector MI(e) = (mi_e1, mi_e2, …, mi_em) for each element e is generated, where mi_ef is the pointwise mutual information between element e and feature f, which is defined as [10]:…”
Section: Stage 3: Generating the Pointwise Mutual Information Vectors
confidence: 99%
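The pointwise mutual information named in the statements above is mi_ef = log(P(e, f) / (P(e) · P(f))), computed from element-feature co-occurrence counts. A minimal sketch with an invented toy count table (the names and numbers are illustrative, not from the paper; zero counts are mapped to 0 rather than negative infinity):

```python
import math

# Hypothetical co-occurrence counts: rows = elements, columns = features.
freq = {
    "apple":  {"eat": 4, "red": 2, "run": 0},
    "cherry": {"eat": 3, "red": 3, "run": 0},
    "jog":    {"eat": 0, "red": 0, "run": 5},
}
features = ["eat", "red", "run"]
total = sum(c for row in freq.values() for c in row.values())

def pmi_vector(e):
    """Return MI(e) = (mi_e1, ..., mi_em), where
    mi_ef = log( P(e, f) / (P(e) * P(f)) ); zero counts give 0.0."""
    p_e = sum(freq[e].values()) / total
    vec = []
    for f in features:
        c_ef = freq[e][f]
        if c_ef == 0:
            vec.append(0.0)
            continue
        p_f = sum(freq[x][f] for x in freq) / total
        vec.append(math.log((c_ef / total) / (p_e * p_f)))
    return vec

mi_apple = pmi_vector("apple")
```

Positive entries mark features that co-occur with the element more often than chance would predict, which is why these vectors tend to yield better clusters than raw frequency vectors.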