Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002
DOI: 10.1145/564376.564386

Title language model for information retrieval

Abstract: In this paper, we propose a new language model, namely, a title language model, for information retrieval. Unlike the traditional language model used for retrieval, we define the conditional probability P(Q|D) as the probability of using query Q as the title for document D. We adopted the statistical translation model learned from the title and document pairs in the collection to compute the probability P(Q|D). To avoid the sparse data problem, we propose two new smoothing methods. In the experiments w…
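
As a rough illustration of the idea in the abstract, the Python sketch below scores a query as a candidate title by translating document words into query words. It assumes a translation-style likelihood, P(Q|D) = Π_q Σ_w t(q|w) P(w|D), which may differ from the paper's exact formulation. The names `title_query_likelihood`, `trans`, `background`, and `lam` are hypothetical, and the linear interpolation with a background distribution is only a stand-in for the paper's two smoothing methods, which the abstract does not describe.

```python
from collections import Counter


def title_query_likelihood(query, doc, trans, background, lam=0.5):
    """Estimate P(Q|D): how likely query Q is to serve as a title for document D.

    trans maps (query_word, doc_word) -> t(q|w), a translation probability
    assumed to have been learned from title/document pairs.
    background maps query_word -> a collection-level probability used for
    smoothing (an assumption, not the paper's smoothing methods).
    """
    counts = Counter(doc)
    total = sum(counts.values())
    likelihood = 1.0
    for q in query:
        # Probability of producing q by translating a word drawn from the document.
        translated = 0.0
        if total > 0:
            translated = sum(
                trans.get((q, w), 0.0) * (c / total) for w, c in counts.items()
            )
        # Interpolate with the background model so unseen pairs do not zero the score.
        likelihood *= lam * translated + (1.0 - lam) * background.get(q, 0.0)
    return likelihood


# Toy data, invented for illustration only.
doc = ["retrieval", "model", "retrieval", "smoothing"]
trans = {
    ("search", "retrieval"): 0.4,
    ("retrieval", "retrieval"): 0.6,
    ("model", "model"): 1.0,
}
background = {"search": 0.01, "model": 0.02}
score = title_query_likelihood(["search", "model"], doc, trans, background)
```

The query word "search" never occurs in the toy document, yet it still receives probability mass through the translation entry for "retrieval", which is the kind of vocabulary bridging a translation model is meant to provide.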

Cited by 69 publications (37 citation statements)
References 9 publications
“…Empirical studies have shown that anchor texts exhibit characteristics similar to both user queries and document titles [24]. Language models generated from document titles also can be used as an approximation of a user query language model [30]. Anchor text has been widely used in the IR field to improve search effectiveness [22,23,25,31,35,36,38].…”
Section: Query Set (mentioning)
confidence: 99%
“…This is the basic formulation of the HMM model proposed by Miller et al and often referred to as the simple language model which has been used as the baseline language model in several studies (Lavrenko & Croft, 2001; Liu & Croft, 2002; Jin et al, 2002). Retrieval experiments on TREC test collections show that the simple two-state system can do dramatically better than the tf-idf measure.…”
Section: Miller Et Al (1999) Use a Two State Hidden Markov Model (Hm… (mentioning)
confidence: 99%
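
The "simple language model" referred to in the statement above is commonly written as a two-state mixture, P(Q|D) = Π_q [a · P(q|D) + (1 − a) · P(q|GE)], where GE stands for general English (the collection). The Python sketch below is a minimal version under that assumption; the function name `two_state_log_likelihood`, the weight `a_doc`, and the toy counts are hypothetical and are not taken from any of the cited papers.

```python
import math
from collections import Counter


def two_state_log_likelihood(query, doc, collection_counts, a_doc=0.7):
    """Log P(Q|D) under a two-state mixture of a document and a collection model.

    a_doc is the weight of the document state; (1 - a_doc) is the weight of
    the general-English (collection) state. Both are fixed here, whereas a
    real system would estimate them.
    """
    doc_counts = Counter(doc)
    doc_len = sum(doc_counts.values())
    coll_len = sum(collection_counts.values())
    log_p = 0.0
    for q in query:
        p_doc = doc_counts[q] / doc_len if doc_len else 0.0
        p_ge = collection_counts.get(q, 0) / coll_len if coll_len else 0.0
        p = a_doc * p_doc + (1.0 - a_doc) * p_ge
        if p == 0.0:
            # The term is unseen in both the document and the collection.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy data, invented for illustration only.
doc = ["title", "language", "model"]
collection_counts = Counter({"title": 10, "language": 20, "model": 30, "query": 40})
log_p = two_state_log_likelihood(["title", "query"], doc, collection_counts)
```

Working in log space avoids underflow when queries are long, and the collection state keeps a query term that is missing from the document from driving the score to zero.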
“…Building upon the ideas of Berger & Lafferty (1999), Jin et al (2002) propose to construct language models of document titles and determine the relevance of a document to a query by estimating the likelihood that the query would have been the title for the document. The title of a document is viewed as a translation from that document and the title language model is regarded as an approximate language model of the query.…”
Section: Miller Et Al (1999) Use a Two State Hidden Markov Model (Hm… (mentioning)
confidence: 99%
“…Ponte and Croft originally proposed LM for IR [10], then Song put emphasis on data smoothing techniques in LM [12]. Recently, many variations of traditional LM have been developed to improve IR performance, such as relevance-based language model [13], time-based language model [14] and title language model [15]. In this paper, we extend the traditional document LM to the author LM and the category LM according to the nature of BBS articles.…”
Section: Related Work (mentioning)
confidence: 99%