2020
DOI: 10.1609/aaai.v34i04.5797

Distributionally Robust Counterfactual Risk Minimization

Abstract: This manuscript introduces the idea of using Distributionally Robust Optimization (DRO) for the Counterfactual Risk Minimization (CRM) problem. Tapping into a rich existing literature, we show that DRO is a principled tool for counterfactual decision making. We also show that well-established solutions to the CRM problem like sample variance penalization schemes are special instances of a more general DRO problem. In this unifying framework, a variety of distributionally robust counterfactual risk estimators c…
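To make the abstract's claim concrete, here is a minimal Python sketch of the sample-variance-penalized counterfactual risk objective (the CRM estimator of Swaminathan and Joachims) that the paper identifies as a special instance of KL-DRO. The function `policy_prob` and the per-sample arrays are hypothetical names assumed for illustration; this is a sketch, not the paper's implementation.

```python
import numpy as np

def crm_objective(policy_prob, contexts, actions, propensities, losses,
                  clip=10.0, var_penalty=0.1):
    """Counterfactual risk: clipped importance-weighted loss plus a
    sample-variance penalty (recovered by the paper as a KL-DRO instance)."""
    # Importance weights of the target policy w.r.t. the logging policy,
    # capped at `clip` to control variance.
    w = np.minimum(policy_prob(contexts, actions) / propensities, clip)
    r = w * losses                      # per-sample counterfactual losses
    n = len(losses)
    # Variance penalization term; in the DRO view, the penalty strength
    # plays the role of the ambiguity-set radius.
    return r.mean() + var_penalty * np.sqrt(r.var(ddof=1) / n)
```

Minimizing this objective over a parametric policy class reproduces the variance-penalized CRM estimator discussed in the abstract.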

Cited by 26 publications (31 citation statements)
References 14 publications
“…Calafiore [34] also studied a distributionally robust portfolio selection problem in which the KL-divergence-based ambiguity set for the return distribution is constructed around a discrete nominal distribution. Chen et al [37] applied the DRO model over a KL-divergence-based ambiguity set to the unit commitment problem, and Faury et al [63] illustrated how KL DRO can serve as a principled tool for the counterfactual risk minimization problem. [87] considered the complexity of the new KL-divergence-based DRO approach for general problems of the form (12).…”
Section: Kullback-Leibler (KL)
mentioning
confidence: 99%
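For orientation, the KL-DRO problem these statements refer to is standardly written as follows, where \(\hat P_n\) is the nominal (empirical) distribution and \(\rho\) the ambiguity radius; the equality is the usual Donsker-Varadhan dual. This is a generic textbook formulation, not copied from the cited works.

```latex
\min_{\theta}\ \sup_{Q:\, D_{\mathrm{KL}}(Q \,\|\, \hat P_n) \le \rho}\ \mathbb{E}_{Q}\bigl[\ell(\theta; z)\bigr]
\;=\;
\min_{\theta,\ \lambda > 0}\ \lambda \rho + \lambda \log \mathbb{E}_{\hat P_n}\!\left[ e^{\ell(\theta; z)/\lambda} \right]
```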
“…Interestingly, adversarial training can also be understood as solving DRO with a Wasserstein metric-based set [Staib and Jegelka, 2017]. Applications of DRO have been explored in contextual bandits for policy learning [Si et al, 2020, Mo et al, 2020, Faury et al, 2020] and evaluation [Kato et al, 2020, Jeong and …]. Uncertainty sets based on KL-divergence, L1, L2 and L∞ norms have been studied [Nilim and El Ghaoui, 2005, Iyengar, 2005] in robust RL.…”
Section: Related Work
mentioning
confidence: 99%
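The adversarial-training connection mentioned in this statement can be stated explicitly: under an ∞-Wasserstein ball of radius \(\rho\), the DRO objective reduces to point-wise adversarial perturbations. This is a standard identity given here for orientation, under the assumption that the ground metric matches the perturbation norm.

```latex
\min_{\theta}\ \sup_{Q:\, W_\infty(Q, \hat P_n) \le \rho}\ \mathbb{E}_{Q}\bigl[\ell(\theta; z)\bigr]
\;=\;
\min_{\theta}\ \mathbb{E}_{\hat P_n}\Bigl[\, \max_{\|\delta\| \le \rho} \ell(\theta; z + \delta) \,\Bigr]
```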
“…Any sample z ∼ γ is now accepted with probability P_a(z) = w_φ(z)/m. Interestingly, by actively capping the importance weights as is done in counterfactual estimation [5,8], one controls the acceptance rate P_a(z) of the rejection sampling algorithm:…”
Section: Sampling From the Latent Importance Weights
mentioning
confidence: 99%
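A minimal Python sketch of the rejection-sampling step this statement describes, assuming a hypothetical proposal sampler `sample_gamma` and importance-weight function `w_phi` (names invented for illustration). Capping the weights at m bounds the rejection rate at the cost of the usual capping bias, exactly the trade-off made by capped counterfactual estimators.

```python
import numpy as np

def rejection_sample(sample_gamma, w_phi, m, n_samples, seed=0):
    """Draw samples with density proportional to w_phi times the proposal,
    accepting z ~ gamma with probability P_a(z) = min(w_phi(z), m) / m."""
    rng = np.random.default_rng(seed)
    accepted = []
    while len(accepted) < n_samples:
        z = sample_gamma(rng)
        # Capped importance weight: clipping at m keeps the acceptance
        # probability in (0, 1] and controls the rejection rate.
        p_accept = min(w_phi(z), m) / m
        if rng.random() < p_accept:
            accepted.append(z)
    return np.array(accepted)
```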