1985
DOI: 10.1016/0196-8858(85)90002-8
Asymptotically efficient adaptive allocation rules

Cited by 2,102 publications (2,080 citation statements)
References 2 publications
“…Also with uncensored demand data, Chang et al [? ] propose an adaptive algorithm using results from multi-armed bandit problems (see Lai and Robbins [23] and Auer et al [1] for more details). Another approach with uncensored demand data is the bootstrap method, as shown in Bookbinder and Lordahl [3], to estimate the fractile of the demand distribution.…”
Section: Literature Review and Our Contributions, Classical Inventory (mentioning)
Confidence: 99%
“…We did not consider random nodes here, but they could easily be included as well. We do not write explicitly a proof of the consistency of these algorithms, but we guess that the proof is a consequence of properties in [5,8,2,1]. We'll see the choice of constants below.…”
Section: End While If the Root Is In 1p Or 2p Then (mentioning)
Confidence: 99%
“…Lai and Robbins [2] showed that any asymptotically efficient algorithm for the multi-armed bandit problem must choose suboptimal arms for an expected number of times that is at least logarithmic in time. That is,…”
Section: A Bound On Optimal Performance (mentioning)
Confidence: 99%
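The equation truncated at the end of the quotation above is not recoverable from the citing paper, but the bound it refers to is the well-known Lai–Robbins result. In its standard formulation (a sketch of the common statement, not a quotation from the citing paper), for any uniformly good policy and any suboptimal arm $j$:

$$\liminf_{n\to\infty} \frac{\mathbb{E}\left[T_j(n)\right]}{\ln n} \;\ge\; \frac{1}{D\!\left(p_j \,\middle\|\, p^{*}\right)},$$

where $T_j(n)$ is the number of times arm $j$ has been pulled after $n$ rounds and $D(p_j \,\|\, p^{*})$ is the Kullback–Leibler divergence between the reward distribution of arm $j$ and that of an optimal arm.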
“…Literature review: The multi-armed bandit problem has been extensively studied; a survey is presented in [1]. In their seminal work, Lai and Robbins [2] established a logarithmic lower bound on the expected number of times a sub-optimal arm needs to be selected by an optimal policy. This research has been supported in part by ONR grant N00014-09-1-1074 and ARO grant W911NG-11-1-0385.…”
Section: Introduction (mentioning)
Confidence: 99%
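Several of the citing works above build index policies that match the Lai–Robbins logarithmic lower bound up to constants; UCB1 (Auer et al., cited in the first statement) is the simplest such policy. A minimal sketch on Bernoulli arms, assuming the standard UCB1 index with exploration term sqrt(2 ln t / n_i):

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given success probabilities.

    Returns the pull count of each arm after `horizon` rounds. UCB1 pulls
    each suboptimal arm only O(log horizon) times in expectation, matching
    the Lai-Robbins lower bound up to constant factors.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k       # pulls per arm
    totals = [0.0] * k     # cumulative reward per arm

    # Initialisation: pull every arm once.
    for a in range(k):
        counts[a] = 1
        totals[a] = 1.0 if rng.random() < means[a] else 0.0

    for t in range(k + 1, horizon + 1):
        # Pull the arm maximising empirical mean + confidence radius.
        a = max(range(k),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]))
        counts[a] += 1
        totals[a] += 1.0 if rng.random() < means[a] else 0.0
    return counts

pulls = ucb1([0.9, 0.5], horizon=5000)
```

With a clear gap between the arms, the optimal arm (mean 0.9) absorbs the vast majority of the 5,000 pulls, while the suboptimal arm's count grows only logarithmically in the horizon.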