Warren B. Powell scite author profile

We consider a Bayesian ranking and selection problem with independent normal rewards and a correlated multivariate normal belief on the mean values of these rewards. Because this formulation of the ranking and selection problem models dependence between alternatives' mean values, algorithms may utilize this dependence to perform efficiently even when the number of alternatives is very large. We propose a fully sequential sampling policy called the knowledge-gradient policy, which is provably optimal in some special cases and has bounded suboptimality in all others. We then demonstrate how this policy may be applied to efficiently maximize a continuous function on a continuous domain while constrained to a fixed number of noisy measurements.

A Knowledge-Gradient Policy for Sequential Information Collection

Frazier, Powell, Dayanık

2008

SIAM J. Control Optim.

345

268

Abstract. In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we refer to as the knowledge-gradient policy. This policy myopically maximizes the expected increment in the value of information in each time period, where the value is measured according to the terminal utility function. We show that the knowledge-gradient policy is optimal both when the horizon is a single time period and in the limit as the horizon extends to infinity. We show furthermore that, in some special cases, the knowledge-gradient policy is optimal regardless of the length of any given fixed total sampling horizon. We bound the knowledge-gradient policy's suboptimality in the remaining cases, and show through simulations that it performs competitively with or significantly better than other policies.
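The myopic rule described in this abstract can be illustrated with a short sketch. It assumes the standard closed form of the knowledge-gradient factor for independent normal beliefs with a common known sampling variance; the function names (`kg_factors`, `kg_choose`, `bayes_update`) and the parameter `noise_var` are illustrative choices, not taken from the paper, and the sketch is not the authors' implementation.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(mu, sigma2, noise_var):
    """Expected one-step gain in the best posterior mean for each alternative.

    mu, sigma2: current belief means and variances (independent normals).
    noise_var:  common known sampling variance (assumed name).
    Requires at least two alternatives.
    """
    factors = []
    for x, (m, s2) in enumerate(zip(mu, sigma2)):
        # Standard deviation of the change in the posterior mean of x
        # caused by one more measurement of x.
        sigma_tilde = s2 / math.sqrt(s2 + noise_var)
        if sigma_tilde == 0.0:
            factors.append(0.0)  # nothing left to learn about x
            continue
        best_other = max(v for i, v in enumerate(mu) if i != x)
        zeta = -abs(m - best_other) / sigma_tilde
        factors.append(sigma_tilde * (zeta * _cdf(zeta) + _pdf(zeta)))
    return factors


def kg_choose(mu, sigma2, noise_var):
    """Index of the alternative with the largest knowledge-gradient factor."""
    f = kg_factors(mu, sigma2, noise_var)
    return max(range(len(f)), key=f.__getitem__)


def bayes_update(mu, sigma2, x, y, noise_var):
    """Conjugate normal update of the belief about x after observing y."""
    precision = 1.0 / sigma2[x] + 1.0 / noise_var
    mu[x] = (mu[x] / sigma2[x] + y / noise_var) / precision
    sigma2[x] = 1.0 / precision
```

A sampling loop would alternate `kg_choose`, a noisy measurement of the chosen alternative, and `bayes_update` until the measurement budget is spent, then report the alternative with the largest posterior mean.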

Approximate Dynamic Programming

Powell

2007

1,324

172

Approximate Dynamic Programming

Powell

2011

999

155

This is my preface. I am going to explain why I wrote this book and who it is for.

Chapter 1: The challenges of dynamic programming

The optimization of problems over time arises in many settings, ranging from the control of heating systems to managing entire economies. In between are examples including landing aircraft, purchasing new equipment, managing blood inventories, scheduling fleets of vehicles, selling assets, investing money in portfolios or just playing a game of tic-tac-toe or backgammon. These problems involve making decisions, then observing information, after which we make more decisions, and then more information, and so on. Known as sequential decision problems, they can be straightforward (if subtle) to formulate, but solving them is another matter.

Dynamic programming has its roots in several fields. Engineering and economics tend to focus on problems with continuous states and decisions (these communities refer to decisions as controls), while the fields of operations research and artificial intelligence work primarily with discrete states and decisions (or actions). Problems that are modeled with continuous states and decisions (and typically in continuous time) are often addressed under the umbrella of "control theory" whereas problems with discrete states and decisions, modeled in discrete time, are studied at length under the umbrella of "Markov decision processes." Both of these subfields set up recursive equations that depend on the use of a state variable to capture history in a compact way. There are many high-dimensional problems such as those involving the allocation of resources that are generally studied using the tools of mathematical programming. Most of this work focuses on deterministic problems using tools such as linear, nonlinear or integer programming, but there is a subfield known as stochastic programming which incorporates uncertainty. Our presentation spans all of these fields.

Warren B. Powell

Handbook of Learning and Approximate Dynamic Programming

The Knowledge-Gradient Policy for Correlated Normal Beliefs

A Knowledge-Gradient Policy for Sequential Information Collection

Approximate Dynamic Programming

Approximate Dynamic Programming
