1988
DOI: 10.1017/s0021900200040420

Restless bandits: activity allocation in a changing world

Abstract: We consider a population of n projects which in general continue to evolve whether in operation or not (although by different rules). It is desired to choose the projects in operation at each instant of time so as to maximise the expected rate of reward, under a constraint upon the expected number of projects in operation. The Lagrange multiplier associated with this constraint defines an index which reduces to the Gittins index when projects not being operated are static. If one is constrained to operate m pr…
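
As a hedged illustration of the Lagrangian relaxation the abstract describes, the block below writes out one common way to state it. The symbols x_i, a_i, r_i, \nu, \pi and the equality form of the constraint are assumptions introduced here for illustration; they do not appear in the abstract.

```latex
% Assumed notation (not from the abstract): project i has state x_i,
% action a_i \in \{0,1\} (1 = in operation, 0 = not in operation),
% and reward rate r_i(x_i, a_i); \pi is a policy, and expectations are
% taken under the long-run (time-average) behaviour of \pi.

% Relaxed problem: the constraint on the number of projects in operation
% is imposed in expectation only.
\max_{\pi}\; \mathbb{E}_{\pi}\Big[\sum_{i=1}^{n} r_i(x_i, a_i)\Big]
\quad\text{subject to}\quad
\mathbb{E}_{\pi}\Big[\sum_{i=1}^{n} a_i\Big] = m .

% Lagrangian with multiplier \nu, written as a subsidy for each project
% that is not in operation (using \sum_i (1 - a_i) = n - \sum_i a_i):
L(\pi, \nu) = \sum_{i=1}^{n}
  \mathbb{E}_{\pi}\big[\, r_i(x_i, a_i) + \nu\,(1 - a_i) \,\big]
  \;-\; \nu\,(n - m) .
```

Under these assumptions the Lagrangian separates into n single-project problems, and the index of a state can be read as the value of \nu at which operating the project and leaving it idle are equally attractive in that state.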

Cited by 425 publications (1,026 citation statements)
References 1 publication
“…Gittins derived this result as a by-product of his ground-breaking results on the multi-armed bandit problem. The literature of multiarmed bandit related papers that build on Gittins's result is huge, see, e.g., [3, 20, 21, 22, 23]. However, Gittins's optimality result in the context of the M/G/1 queue has not been fully exploited, and it has not received the attention it deserves.…”
Section: Introduction (mentioning)
confidence: 99%
“…Building upon the seminal work of Whittle (1988), Dusonchet (2003) and Niño-Mora (2007) formulate the multi-item make-to-stock queue with average cost criterion as a Restless Bandit Problem. Niño-Mora (2007) shows that the marginal productivity index obtained from the restless bandit formulation coincides with the myopic(T) scheduling policy in the case of linear holding and backordering costs, and further extends it to models with convex nonlinear cost rates and/or discounted costs.…”
Section: Literature Review (mentioning)
confidence: 99%
“…We therefore seek heuristic policies whose performance comes close to being cost minimising. An approach espoused by Whittle [15] is to seek a calibration of each machine in the form of an index, namely a real-valued function defined on its state space. Write $\gamma_m : \mathbb{N} \times \mathbb{R}_+ \times (\mathbb{R}_+ \cup \{*\}) \to \mathbb{R}_+$ for the (continuous) index for machine m, which in the current context would be some measure of congestion at that machine moderated to take account of machine availability.…”
Section: Comments (mentioning)
confidence: 99%