Mining Optimal Policies: A Pattern Recognition Approach to Model Analysis

Bravo, Fernanda; Shaposhnik, Yaron

doi:10.1287/ijoo.2019.0026

Cited by 23 publications

(8 citation statements)

References 42 publications


“…Extract simple heuristic policies from DRL outputs: Instead of leveraging analytical results to improve the numerical performance of DRL, the numerical policies developed by DRL may also spur the development of new, simpler heuristic policies that can be analytically characterized, and are thus easier to implement. A novel approach to convert numerical results into analytic insights is proposed by Bravo & Shaposhnik (2020). They leverage exact numerical methods to find the optimal value functions for a range of MDPs, and subsequently use the output as input to a machine learning method to extract analytic insights into the structure of the optimal policy for a range of problem domains: inventory management, queuing admission control, multi-armed bandits, and revenue management.…”

Section: Blending Numerical and Analytical Approaches To Optimize Inventory Policies (mentioning)

confidence: 99%
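
The two-step workflow described in the statement above (solve an MDP exactly, then feed the optimal policy to a learner that recovers its structure) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the cited paper: the single-item lost-sales inventory model, the cost parameters, the demand distribution, the helper names `solve_inventory_mdp` and `fit_base_stock`, and the use of a brute-force one-parameter base-stock fit as a stand-in for a general machine learning method.

```python
def solve_inventory_mdp(max_inv=10, demand_pmf=(0.1, 0.2, 0.3, 0.2, 0.2),
                        order_cost=1.0, hold_cost=0.5, lost_sale_cost=4.0,
                        gamma=0.95, tol=1e-9, max_iter=10_000):
    """Value iteration for a hypothetical lost-sales inventory MDP.

    State: on-hand inventory x in 0..max_inv. Action: order quantity q with
    x + q <= max_inv. Demand d occurs with probability demand_pmf[d].
    Returns the (approximately) optimal cost-to-go and order quantities.
    """
    n_states = max_inv + 1
    value = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_value = [0.0] * n_states
        for x in range(n_states):
            best_cost, best_q = float("inf"), 0
            for q in range(max_inv - x + 1):
                y = x + q  # inventory position after ordering
                total = order_cost * q
                for d, prob in enumerate(demand_pmf):
                    leftover = max(y - d, 0)
                    shortage = max(d - y, 0)
                    total += prob * (hold_cost * leftover
                                     + lost_sale_cost * shortage
                                     + gamma * value[leftover])
                # Strict improvement only, so ties go to the smaller order.
                if total < best_cost - 1e-12:
                    best_cost, best_q = total, q
            new_value[x], policy[x] = best_cost, best_q
        gap = max(abs(a - b) for a, b in zip(new_value, value))
        value = new_value
        if gap < tol:
            break
    return value, policy


def fit_base_stock(policy):
    """Fit the rule q(x) = max(S - x, 0) to a tabulated policy.

    Tries every level S and returns the one that agrees with the policy in
    the most states, together with the fraction of states matched.
    """
    n = len(policy)
    best_level, best_agreement = 0, -1.0
    for level in range(n):
        matches = sum(1 for x, q in enumerate(policy)
                      if q == max(level - x, 0))
        agreement = matches / n
        if agreement > best_agreement:
            best_level, best_agreement = level, agreement
    return best_level, best_agreement


value, policy = solve_inventory_mdp()
level, agreement = fit_base_stock(policy)
print("optimal order quantities by inventory level:", policy)
print("fitted base-stock level:", level, "| share of states matched:", agreement)
```

A high share of matched states would suggest that the numerically computed policy has an order-up-to structure; a low share would indicate that a richer rule class (for example a decision tree over several state features) is needed.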

“…This sharply contrasts with the often highly intuitive character of policies obtained via analytic methods, such as for instance base-stock or constant order policies. In addition to developing more intuitive policies from the complex output of neural networks similar to Bravo & Shaposhnik (2020), we may also develop models that additionally explain why an action is proposed. A vast field exists on explaining and interpreting the output of AI models (see also Gilpin et al., 2018).…”

Section: Blending Numerical and Analytical Approaches To Optimize Inventory Policies (mentioning)

confidence: 99%


Deep Reinforcement Learning for Inventory Control: A Roadmap

et al. 2021


Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control. Yet, the abundance of choices that come with designing a DRL algorithm, combined with the intense computational effort to tune and evaluate each choice, may hamper their application in practice. This paper describes the key design choices of DRL algorithms to facilitate their implementation in inventory control. We also shed light on possible future research avenues that may elevate the current state-of-the-art of DRL applications for inventory control and broaden their scope by leveraging and improving on the structural policy insights within inventory research. Our discussion and roadmap may also spur future research in other domains within operations management.


“…Recently, there has been some works toward developing methods for interpretable policies in sequential decision-making (not necessarily specific to the healthcare setting). Bravo and Shaposhnik (2020) propose to explain the optimal unconstrained policies with decision trees, applying their framework to classical operations problems such as queuing control and multi-armed bandit (MAB). However, this may be misleading, as there is no guarantee that the novel explainable, suboptimal policies have the same performance as the unconstrained, optimal policies (Rudin 2019).…”

Section: Related Literature (mentioning)

confidence: 99%

Interpretable Machine Learning for Resource Allocation with Application to Ventilator Triage

Grand-Clément, Chan, Goyal

et al. 2021

Preprint


Rationing of healthcare resources is a challenging decision that policy makers and providers may be forced to make during a pandemic, natural disaster, or mass casualty event. Well-defined guidelines to triage scarce life-saving resources must be designed to promote transparency, trust and consistency. To facilitate buy-in and use during high stress situations, these guidelines need to be interpretable and operational. We propose a novel data-driven model to compute interpretable triage guidelines based on policies for Markov Decision Process that can be represented as simple sequences of decision trees (tree policies). In particular, we characterize the properties of optimal tree policies and present an algorithm based on dynamic programming recursions to compute good tree policies. We utilize this methodology to obtain simple, novel triage guidelines for ventilator allocations for COVID-19 patients, based on real patient data from Montefiore hospitals. We also compare the performance of our guidelines to the official New York State guidelines that were developed in 2015 (well before the COVID-19 pandemic). Our empirical study shows that the number of excess deaths associated with ventilator shortages could be reduced significantly using our policy. Our work highlights the limitations of the existing official triage guidelines, which need to be adapted specifically to COVID-19 before being successfully deployed.


“…In recent years there has been an emerging interest in combining ideas from machine learning with operations research to develop a framework that uses data to prescribe optimal decisions (Bertsimas and Kallus, 2019; Den Hertog and Postek, 2016; Bravo and Shaposhnik, 2018). Current research focus has been on applying machine learning methodologies to predict the counterfactuals, based on which optimal decisions can be made.…”

Section: The Problem and Related Work (mentioning)

confidence: 99%

Distributionally Robust Learning

Chen, Paschalidis

2020

FNT in Optimization


This monograph develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data using Distributionally Robust Optimization (DRO) under the Wasserstein metric. Beginning with fundamental properties of the Wasserstein metric and the DRO formulation, we explore duality to arrive at tractable formulations and develop finite-sample, as well as asymptotic, performance guarantees. We consider a series of learning problems, including (i) distributionally robust linear regression; (ii) distributionally robust regression with group structure in the predictors; (iii) distributionally robust multi-output regression and multiclass classification, (iv) optimal decision making that combines distributionally robust regression with nearest-neighbor estimation; (v) distributionally robust semi-supervised learning, and (vi) distributionally robust reinforcement learning. A tractable DRO relaxation for each problem is being derived, establishing a connection between robustness and regularization, and obtaining bounds on the prediction and estimation errors of the solution. Beyond theory, we include numerical experiments and case studies using synthetic and real data. The real data experiments are all associated with various health informatics problems, an application area which provided the initial impetus for this work.

