2022
DOI: 10.1609/aaai.v36i9.21249

Towards Robust Off-Policy Learning for Runtime Uncertainty

Abstract: Off-policy learning plays a pivotal role in optimizing and evaluating policies prior to online deployment. However, during real-time serving, we observe a variety of interventions and constraints that cause inconsistency between the online and offline settings, which we summarize and term runtime uncertainty. Such uncertainty cannot be learned from the logged data because it is abnormal and rare. To assert a certain level of robustness, we perturb the off-policy estimators along an adversa…
