Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval 2010
DOI: 10.1145/1835449.1835505

Estimation of statistical translation models based on mutual information for ad hoc information retrieval

Abstract: As a principled approach to capturing semantic relations of words in information retrieval, statistical translation models have been shown to outperform simple document language models which rely on exact matching of words in the query and documents. A main challenge in applying translation models to ad hoc information retrieval is to estimate a translation model without training data. Existing work has relied on training on synthetic queries generated based on a document collection. However, this method is co…

Cited by 66 publications (49 citation statements)
References 37 publications (43 reference statements)
“…Metzler et al. [9] used translation models as a similarity measure for the information flow tracking task. More recently, [2] trained a translation model with the mutual information of co-occurring terms and used it to improve ad hoc retrieval. Gao et al. [8] extended the translation model to translate between phrases and used it to bridge the vocabulary gap between the Web search query and the page title.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
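The statement above says the cited paper estimates a translation model from the mutual information of co-occurring terms. The following is a minimal sketch of one way to do that, not the paper's exact procedure: it assumes binary document-level occurrence, computes pairwise mutual information from document counts, and normalises over the translated word. The function name `mi_translation_model`, the toy corpus, and the fallback for zero-information words are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def mi_translation_model(docs):
    """Return model[u][w] ~ p(w | u), obtained by normalising I(w; u) over w.

    Assumption: each word is a binary variable (present/absent in a document).
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    df = Counter(w for s in doc_sets for w in s)      # document frequency
    co = Counter()                                     # co-occurrence counts
    for s in doc_sets:
        for a, b in combinations(sorted(s), 2):
            co[(a, b)] += 1

    def mutual_information(w, u):
        n11 = df[w] if w == u else co[(min(w, u), max(w, u))]
        pw, pu = df[w] / n, df[u] / n
        # (joint count, marginal of w's state, marginal of u's state)
        cells = [
            (n11, pw, pu),
            (df[w] - n11, pw, 1 - pu),
            (df[u] - n11, 1 - pw, pu),
            (n - df[w] - df[u] + n11, 1 - pw, 1 - pu),
        ]
        mi = 0.0
        for count, mw, mu in cells:
            if count > 0:                              # 0 * log(0) is taken as 0
                joint = count / n
                mi += joint * math.log(joint / (mw * mu))
        return mi

    vocab = sorted(df)
    model = {}
    for u in vocab:
        scores = {w: mutual_information(w, u) for w in vocab}
        total = sum(scores.values())
        if total > 0:
            model[u] = {w: s / total for w, s in scores.items()}
        else:
            # Assumed fallback: a word carrying no information maps to itself.
            model[u] = {w: 1.0 if w == u else 0.0 for w in vocab}
    return model


docs = [["car", "auto"], ["car", "auto", "engine"],
        ["bank", "money"], ["bank", "loan"]]
model = mi_translation_model(docs)
```

In this toy corpus "auto" always co-occurs with "car", so it receives more translation mass from "car" than an unrelated word such as "money" does. Note that mutual information is also positive for words that systematically avoid each other, so a practical system would likely filter or threshold the candidates.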
“…It has been found that the trained translation model usually underestimates the self-translation probability (translating one term to itself), so it usually boosts the self-translation in retrieval [2]. The ranking function can be formulated as:…”
Section: Translation Model for IR
Citation type: mentioning
confidence: 99%
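The statement above notes that self-translation probabilities tend to be underestimated and are therefore boosted at retrieval time. The ranking function itself is elided in the quote and is not reconstructed here. As a hedged illustration only, the sketch below assumes a simple linear interpolation with a hypothetical weight `alpha`: the self-translation entry gains `alpha`, and every entry is scaled by `1 - alpha` so each row still sums to one.

```python
def boost_self_translation(p_t, alpha):
    """Boost p(u | u) in a translation table p_t[u][w] = p(w | u).

    Assumed scheme (not taken from the source):
        p'(u | u) = alpha + (1 - alpha) * p(u | u)
        p'(w | u) = (1 - alpha) * p(w | u)   for w != u
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    boosted = {}
    for u, row in p_t.items():
        new_row = {w: (1.0 - alpha) * p for w, p in row.items()}
        new_row[u] = alpha + (1.0 - alpha) * row.get(u, 0.0)
        boosted[u] = new_row
    return boosted


# Hypothetical table in which "car" keeps only 0.2 of its own mass.
table = {"car": {"car": 0.2, "auto": 0.8}}
boosted = boost_self_translation(table, alpha=0.5)
```

With `alpha = 0.5`, the self-translation entry rises from 0.2 to 0.6 and the row remains a valid distribution.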
“…The term translation probability $P_t(w_i \mid w)$ is different from the bigram probability $P(w_i \mid w)$ in that the words $w_i$ and $w$ are not limited to occur in order and adjacently in the former. Then, the term association information can be integrated into the unigram class model as follows, $P(w_i \mid c) = \sum_{w \in c} P_t(w_i \mid w)\, P(w \mid c)$ (11), where $P(w \mid c)$ reflects the distribution of words in the training documents of class $c$, which can be computed via the maximum likelihood estimate. By replacing $P(w_i \mid c)$ in (10) with the one computed by (11), we have…”
Section: Language Models for Information Retrieval
Citation type: mentioning
confidence: 99%
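Equation (11) in the statement above mixes translation probabilities with a maximum likelihood class model: each word w seen in class c contributes its own probability P(w|c) times the chance of translating w into w_i. A minimal sketch follows, assuming the translation table is stored as `p_t[w][w_i]`; the helper names and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def mle_class_model(class_docs):
    """Maximum likelihood estimate of P(w | c) from the class's training docs."""
    counts = Counter(w for doc in class_docs for w in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def translated_class_prob(w_i, p_w_given_c, p_t):
    """P(w_i | c) = sum over w in c of P_t(w_i | w) * P(w | c), as in (11)."""
    return sum(
        p_t.get(w, {}).get(w_i, 0.0) * p_wc
        for w, p_wc in p_w_given_c.items()
    )


# Toy class with four tokens: P(car | c) = 0.75, P(auto | c) = 0.25.
class_docs = [["car", "car", "auto"], ["car"]]
p_wc = mle_class_model(class_docs)

# Hypothetical translation table, p_t[w][w_i] = P_t(w_i | w).
p_t = {
    "car": {"car": 0.6, "auto": 0.4},
    "auto": {"auto": 0.7, "car": 0.3},
}

p_auto = translated_class_prob("auto", p_wc, p_t)  # 0.4 * 0.75 + 0.7 * 0.25
p_car = translated_class_prob("car", p_wc, p_t)    # 0.6 * 0.75 + 0.3 * 0.25
```

Here "auto" ends up with probability 0.475 under the class model, compared with 0.25 under the plain maximum likelihood estimate, because part of the mass of "car" is translated into it.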