1985
DOI: 10.1016/0196-8858(85)90002-8
Asymptotically efficient adaptive allocation rules

Cited by 2,102 publications (2,080 citation statements)
References 2 publications
“…Also with uncensored demand data, Chang et al [? ] propose an adaptive algorithm using results from multi-armed bandit problems (see Lai and Robbins [23] and Auer et al [1] for more details). Another approach with uncensored demand data is the bootstrap method, as shown in Bookbinder and Lordahl [3], to estimate the fractile of the demand distribution.…”
Section: Literature Review and Our Contributions, Classical Inventory (mentioning)
Confidence: 99%
“…We did not consider random nodes here, but they could easily be included as well. We do not write explicitly a proof of the consistency of these algorithms, but we guess that the proof is a consequence of properties in [5,8,2,1]. We'll see the choice of constants below.…”
Section: End While If the Root Is In 1p Or 2p Then (mentioning)
Confidence: 99%
“…Lai and Robbins [2] showed that any asymptotically efficient algorithm for the multi-armed bandit problem must choose suboptimal arms for an expected number of times that is at least logarithmic in time. That is,…”
Section: A Bound On Optimal Performance (mentioning)
Confidence: 99%
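The equation truncated at the end of the quotation above is not recoverable from the citing paper, but the bound it refers to is the well-known Lai–Robbins result. In its standard formulation (a sketch of the common statement, not a quotation from the citing paper), for any uniformly good policy and any suboptimal arm $j$:

$$\liminf_{n\to\infty} \frac{\mathbb{E}\left[T_j(n)\right]}{\ln n} \;\ge\; \frac{1}{D\!\left(p_j \,\middle\|\, p^{*}\right)},$$

where $T_j(n)$ is the number of times arm $j$ has been pulled after $n$ rounds and $D(p_j \,\|\, p^{*})$ is the Kullback–Leibler divergence between the reward distribution of arm $j$ and that of an optimal arm.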
“…Literature review: The multi-armed bandit problem has been extensively studied; a survey is presented in [1]. In their seminal work, Lai and Robbins [2] established a logarithmic lower bound on the expected number of times a sub-optimal arm needs to be selected by an optimal policy. This research has been supported in part by ONR grant N00014-09-1-1074 and ARO grant W911NG-11-1-0385.…”
Section: Introduction (mentioning)
Confidence: 99%
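Several of the citing works above build index policies that match the Lai–Robbins logarithmic lower bound up to constants; UCB1 (Auer et al., cited in the first statement) is the simplest such policy. A minimal sketch on Bernoulli arms, assuming the standard UCB1 index with exploration term sqrt(2 ln t / n_i):

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given success probabilities.

    Returns the pull count of each arm after `horizon` rounds. UCB1 pulls
    each suboptimal arm only O(log horizon) times in expectation, matching
    the Lai-Robbins lower bound up to constant factors.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k       # pulls per arm
    totals = [0.0] * k     # cumulative reward per arm

    # Initialisation: pull every arm once.
    for a in range(k):
        counts[a] = 1
        totals[a] = 1.0 if rng.random() < means[a] else 0.0

    for t in range(k + 1, horizon + 1):
        # Pull the arm maximising empirical mean + confidence radius.
        a = max(range(k),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]))
        counts[a] += 1
        totals[a] += 1.0 if rng.random() < means[a] else 0.0
    return counts

pulls = ucb1([0.9, 0.5], horizon=5000)
```

With a clear gap between the arms, the optimal arm (mean 0.9) absorbs the vast majority of the 5,000 pulls, while the suboptimal arm's count grows only logarithmically in the horizon.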