The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval 2018
DOI: 10.1145/3209978.3209986

Unbiased Learning to Rank with Unbiased Propensity Estimation

Abstract: Learning to rank with biased click data is a well-known challenge. A variety of methods have been explored to debias click data for learning to rank, such as click models, result interleaving and, more recently, the unbiased learning-to-rank framework based on inverse propensity weighting. Despite their differences, most existing studies separate the estimation of click bias (namely the propensity model) from the learning of ranking algorithms. To estimate click propensities, they either conduct online result ra…
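The inverse propensity weighting the abstract refers to can be sketched as follows. This is a minimal illustration of the general IPW idea, not the paper's specific estimator; the function name and inputs are hypothetical.

```python
# Hypothetical sketch of inverse propensity weighting (IPW) for
# learning to rank: each clicked document's loss term is reweighted
# by the inverse of its estimated examination propensity, so clicks
# at rarely examined positions count proportionally more.

def ipw_loss(clicks, propensities, losses):
    """Inverse-propensity-weighted sum of per-document losses.

    clicks       -- 1 if the document was clicked, else 0
    propensities -- estimated probability each position was examined
    losses       -- per-document ranking loss contributions
    """
    total = 0.0
    for c, p, l in zip(clicks, propensities, losses):
        if c:               # only clicked documents contribute
            total += l / p  # debias by the inverse propensity
    return total

# A click at a low-propensity position (0.25) is up-weighted 4x:
# 0.2/1.0 + 0.2/0.25 = 1.0
print(ipw_loss(clicks=[1, 0, 1],
               propensities=[1.0, 0.5, 0.25],
               losses=[0.2, 0.3, 0.2]))
```

In expectation over which positions the user examines, this weighting recovers the loss that would have been observed without position bias, which is what makes the resulting ranker "unbiased".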


Cited by 151 publications (245 citation statements)
References 40 publications
“…Thus, clicks on positions that are observed less often due to position bias will have greater weight to account for that difference. However, the position bias must be learned and estimated somewhat accurately [1]. On the other side of the spectrum are click models, which attempt to model user behavior completely [4].…”
Section: Learning To Rank From Historical Interactions
confidence: 99%
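The position bias this excerpt mentions is often parameterized as a power law over rank. The model below is a common assumption in the unbiased learning-to-rank literature, not a detail stated in the excerpt itself.

```python
# Hypothetical power-law examination model: the probability that a
# user examines rank k decays as (1/k)**eta. The severity parameter
# eta is not directly observed and must itself be estimated from
# data, as the excerpt notes.

def examination_propensity(rank, eta=1.0):
    """Examination probability for a 1-based rank position."""
    return (1.0 / rank) ** eta

# Under eta = 1, rank 4 is examined a quarter as often as rank 1,
# so a click there receives four times the weight under IPW.
props = [examination_propensity(k, eta=1.0) for k in range(1, 5)]
```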
“…Furthermore, it is unclear Algorithm 1 Dueling Bandit Gradient Descent (DBGD). 1: Input: initial weights: θ1; unit: u; learning rate η.…”
Section: Online Learning To Rank
confidence: 99%
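The DBGD pseudocode quoted above can be sketched as runnable code. This is a minimal illustration under stated assumptions: `user_prefers` stands in for the interleaved online comparison and is a hypothetical callback, not part of the quoted algorithm.

```python
import random

# Minimal sketch of one Dueling Bandit Gradient Descent (DBGD) step:
# perturb the current weights theta along a random unit vector u, and
# step toward the perturbation only if it wins an online comparison.

def random_unit_vector(dim, rng):
    v = [rng.gauss(0, 1) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def dbgd_step(theta, user_prefers, delta=1.0, eta=0.1, rng=random):
    """One DBGD update; returns the (possibly unchanged) weights.

    theta        -- current ranker weights
    user_prefers -- callback: True if the perturbed ranker wins the
                    interleaved comparison (hypothetical stand-in)
    delta        -- exploration step size; eta -- learning rate
    """
    u = random_unit_vector(len(theta), rng)
    candidate = [t + delta * ui for t, ui in zip(theta, u)]
    if user_prefers(candidate, theta):  # interleaved comparison
        return [t + eta * ui for t, ui in zip(theta, u)]
    return theta
```

Because feedback comes from live comparisons rather than logged clicks, DBGD is the online counterpart to the counterfactual approaches discussed elsewhere on this page.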
“…Counterfactual Learning to Rank (CLTR) [1,2,16] aims to learn a ranking model offline from historical interaction data. Employing an offline approach has many benefits compared to an online one.…”
Section: Counterfactual Learning To Rank
confidence: 99%
“…Table 2 provides the click probabilities for three different click behavior models: Perfect click behavior has probabilities proportional to the relevance and never clicks on a non-relevant document, simulating an ideal user. Binarized click behavior acts on only two levels of relevance and is affected by position-bias; this simulated behavior has been used in previous work on CLTR [1,2,16]. And Near-Random behavior clicks very often, and only slightly more frequently on more relevant documents than on less relevant documents; this behavior simulates very high levels of click noise.…”
Section: Simulating User Behavior
confidence: 99%
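The three simulated click behaviors described in the excerpt can be sketched as below. The probability tables are illustrative assumptions, not the exact values from the cited Table 2, and applying position bias only to the binarized model mirrors the excerpt's description.

```python
import random

# Illustrative click probabilities per relevance grade (0-4) for the
# three behaviors: Perfect never clicks non-relevant documents,
# Binarized acts on two relevance levels and is position-biased,
# Near-Random clicks often with only a slight relevance preference.
CLICK_PROB = {
    "perfect":     {0: 0.0, 1: 0.2,  2: 0.4, 3: 0.8,  4: 1.0},
    "binarized":   {0: 0.1, 1: 0.1,  2: 0.1, 3: 1.0,  4: 1.0},
    "near_random": {0: 0.4, 1: 0.45, 2: 0.5, 3: 0.55, 4: 0.6},
}

def simulate_clicks(relevances, behavior, rng=random):
    """Return a 0/1 click list for a ranked list of relevance grades."""
    clicks = []
    for rank, rel in enumerate(relevances, start=1):
        p = CLICK_PROB[behavior][rel]
        if behavior == "binarized":
            p *= 1.0 / rank  # position bias on the binarized model
        clicks.append(1 if rng.random() < p else 0)
    return clicks
```

Simulations like this are standard in CLTR evaluation because they give ground-truth relevance against which the debiased ranker can be scored.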
“…First, both sources are typically limited in availability and are often proprietary company resources. Second, click-stream data is typically biased towards the first few elements in the ranking presented to the user [2] and are noisy in general. Finally, such logs are only available after the fact, leading to a cold start problem.…”
confidence: 99%