2017
DOI: 10.1145/3130348.3130371

Information Retrieval as Statistical Translation

Abstract: We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a statistical model of how a user might distill or "translate" a given document into a query. To assess the relevance of a document to a user's query, we estimate the probability that the query would have been generated as a translation of the document, and factor in the user's general preferences in the form of a prior distribution ove…
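
The scoring idea in the abstract can be sketched roughly as follows: each document is ranked by log p(q | d) + log p(d), where p(q | d) sums, for every query word, over the document words it could have been "translated" from. This is only a minimal sketch under stated assumptions: the translation table `translation`, the function names, the uniform default prior, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, translation, floor=1e-9):
    """log p(q | d) = sum_i log( sum_w t(q_i | w) * l(w | d) ).

    `translation` maps (query_word, doc_word) -> t(query_word | doc_word);
    l(w | d) is the maximum-likelihood unigram estimate from the document.
    `floor` is an illustrative guard against log(0), not a real smoother.
    """
    counts = Counter(doc)
    total = sum(counts.values())
    score = 0.0
    for q in query:
        p = sum(translation.get((q, w), 0.0) * c / total
                for w, c in counts.items())
        score += math.log(max(p, floor))
    return score

def rank(query, docs, translation, prior=None):
    """Order document names by log p(q | d) + log p(d), best first."""
    if prior is None:
        # Assumption: a uniform prior when no preferences are supplied.
        prior = {name: 1.0 / len(docs) for name in docs}
    scored = {name: query_log_likelihood(query, words, translation)
                    + math.log(prior[name])
              for name, words in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical toy data: "car" can be generated from "automobile".
translation = {("car", "car"): 0.6, ("car", "automobile"): 0.4}
docs = {"d1": ["automobile", "engine"], "d2": ["banana", "fruit"]}
ranking = rank(["car"], docs, translation)
```

With this toy table, a document that never contains the literal query word can still outrank one that is unrelated, which is the synonymy effect the translation view is meant to capture.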

Cited by 183 publications (55 citation statements)
References: 9 publications
“…The language model approach can, however, be extended to incorporate more general notions of relevance. Berger and Lafferty, 1999, show how a language modeling approach based on machine translation provides a basis for handling synonymy and polysemy. Hofmann, 1999, describes how mixture models based on latent classes can represent documents and queries.…”
Section: Language Models (mentioning)
confidence: 99%
“…They describe this combination using a Hidden Markov Model with states that represent a unigram language model (P(w | D)), a bigram language model (P(w_n | w_{n-1}, D)), and a model of general English (P(w | English)), and mentions other generation processes such as a synonym model and a topic model. Hofmann, 1999, and Berger and Lafferty, 1999 also describe the generation process using mixture models, but with different approaches to representation. Put simply, incorporating a new representation into the language model approach to retrieval involves estimating the language model (probability distribution) for the features of that representation and incorporating that new model into the overall mixture model.…”
Section: Language Models (mentioning)
confidence: 99%
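
The last sentence of the statement above (estimate a model for the new representation, then fold it into the overall mixture) can be illustrated with plain linear interpolation. This is a simplified sketch, not the Hidden Markov Model formulation the statement refers to: the helper names `unigram_model` and `mixture`, the weights 0.7 and 0.3, and the toy token lists are all assumptions chosen for illustration.

```python
from collections import Counter

def unigram_model(tokens):
    """Maximum-likelihood unigram model: returns a function w -> P(w)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return lambda w: counts[w] / total

def mixture(components):
    """Linear interpolation of models.

    `components` is a list of (weight, model) pairs; the weights must sum
    to 1 so that the result is still a probability distribution.
    """
    if abs(sum(weight for weight, _ in components) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return lambda w: sum(weight * model(w) for weight, model in components)

# Hypothetical toy data: a document model mixed with a background model
# standing in for "general English".
doc_model = unigram_model(["translation", "model", "model"])
background = unigram_model(["the", "model", "of", "english"])
p = mixture([(0.7, doc_model), (0.3, background)])
```

Adding another representation under this scheme means appending one more (weight, model) pair and rescaling the weights so they still sum to 1.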
“…TRLM has yielded the better performance than the traditional methods, such as VSM, Okapi and LM in the question retrieval. It regarded the question retrieval task as a statistical machine translation problem by using IBM model-1 [8] to learn the word-to-word translation probabilities [1], [3], [9]. Some researchers improved the TRLM by considering the latent topic information or the category information [10]-[14].…”
Section: Introduction (mentioning)
confidence: 99%
“…the query is thought of as derived from the document in the same sort of way that in speech the heard sounds are generated from a word string (Berger and Lafferty 1999, Miller et al 1999).…”
Section: Model Characteristics (mentioning)
confidence: 99%