2023
DOI: 10.48550/arxiv.2302.03784
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Leveraging User-Triggered Supervision in Contextual Bandits

Abstract: We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context. Such an interaction arises, for example, in text prediction or autocompletion settings, where a poor suggestion is simply ignored and the user enters the desired text instead. Crucially, this extra feedback is usertriggered on only a subset of the contexts. We develop a new framework to leverage such signals, while being robust to their biased nature. We also augment standard CB algorithms to… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 23 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?