Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004
DOI: 10.1145/1008992.1009026

Cluster-based retrieval using language models

Abstract: Previous research on cluster-based retrieval has been inconclusive as to whether it does bring improved retrieval effectiveness over document-based retrieval. Recent developments in the language modeling approach to IR have motivated us to re-examine this problem within this new retrieval framework. We propose two new models for cluster-based retrieval and evaluate them on several TREC collections. We show that cluster-based retrieval can perform consistently across collections of realistic size, and significa…

Cited by 341 publications (388 citation statements)
References 25 publications (41 reference statements)
“…Finally, many recent improvements in core information retrieval have come from classification and clustering, e.g. viewing document retrieval as a text classification problem (Manning et al, 2008, chapters 11 & 12) or improving retrieval performance using clustering (Liu and Croft, 2004).…”
Section: Data Mining and Machine Learning for IR (mentioning)
confidence: 99%
“…They use a language modeling framework in which their aspect-x algorithm smoothes documents based on the information from the clusters and the strength of the connection between each document and cluster. Liu and Croft (2004) evaluate both the direct retrieval of clusters and cluster-based smoothing. Their CBDM model is a mixture between a document model, a collection model, and the cluster each document belongs to, which is able to significantly outperform a standard query-likelihood baseline.…”
Section: Related Work (mentioning)
confidence: 99%
“…The other parameters are investigated in the range [1,10] with increments of 1. We determine the MAP scores on the same topics that we present results for, similar to Liu and Croft (2004), Metzler and Croft (2005), Mitra et al (1998), Lafferty et al (2001) and Zhai and Lafferty (2004). While computationally expensive (exponential in the number of parameters), it does provide us with an upper bound on the retrieval performance that one might achieve using the described models.…”
Section: Parameter Estimation (mentioning)
confidence: 99%
“…There are several studies that show the benefit of document-side expansion by extracting features from similar documents based on the language models (e.g. [17,20]). However, the particularity of the ad retrieval and the relationship between the landing page and the ads makes our problem significantly different than the setting explored there.…”
Section: Sponsored Search and Content Match (mentioning)
confidence: 99%
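
One of the citing statements above describes CBDM as a mixture of a document model, a collection model, and the model of the cluster each document belongs to. The sketch below illustrates that idea on toy data. The statements give no formula, so the nested linear interpolation, the weights lam and beta, the toy documents, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def ml_model(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def cbdm_prob(term, doc_lm, cluster_lm, coll_lm, lam=0.5, beta=0.5):
    """P(term | document) as a three-way mixture.

    lam weights the document model; beta splits the remaining mass
    between the cluster model and the collection model. Both values
    are illustrative assumptions, not tuned settings.
    """
    background = (beta * cluster_lm.get(term, 0.0)
                  + (1.0 - beta) * coll_lm.get(term, 0.0))
    return lam * doc_lm.get(term, 0.0) + (1.0 - lam) * background


def query_likelihood(query, doc_lm, cluster_lm, coll_lm, lam=0.5, beta=0.5):
    """Score a document by the product of smoothed query-term probabilities."""
    score = 1.0
    for term in query:
        score *= cbdm_prob(term, doc_lm, cluster_lm, coll_lm, lam, beta)
    return score


# Toy collection (hypothetical): two clusters, three documents.
doc1 = ["apple", "pie"]
doc2 = ["apple", "tart", "recipe"]
doc3 = ["car", "engine"]

cluster_a_lm = ml_model(doc1 + doc2)   # doc1 and doc2 share a cluster
cluster_b_lm = ml_model(doc3)          # doc3 is alone in its cluster
coll_lm = ml_model(doc1 + doc2 + doc3)

query = ["tart"]
score_doc1 = query_likelihood(query, ml_model(doc1), cluster_a_lm, coll_lm)
score_doc3 = query_likelihood(query, ml_model(doc3), cluster_b_lm, coll_lm)
print(score_doc1 > score_doc3)
```

Neither doc1 nor doc3 contains the query term, but doc1 shares a cluster with a document that does, so the cluster component lifts its score above doc3, whose only support comes from the collection model.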