Multi-Armed Bandits and the Gittins Index

Whittle, Peter

doi:10.1111/j.2517-6161.1980.tb01111.x

Cited by 310 publications

(219 citation statements)

References 10 publications

Citation types: Supporting, Mentioning (217), Contrasting, Unclassified

“…The following is a simple consequence of Lemma 4 and a result due to Whittle [22]. THEOREM 5 (optimal replenishment, mixed model): If Condition 1 holds with the ?ik being the qualifying sequences, then in the state in which $n_{kl}$ lifts of ammunition are available for combat aboard ship $kl$, $1 \le l \le M_k$, $1 \le k \le g$, the next lift in an optimal replenishment strategy should be to any ship $ij$ such that $n_{ij} < L_{ij}$ and where G,(El, .)…”

Section: Mixed Models (mentioning)

confidence: 98%

“…When this is the case, it will follow from Lemma 4 that there is a globally optimal strategy which Although such a strategy always enjoys a constrained optimality property (Lemma 4) and would seem invariably to be a sensible heuristic, it nevertheless remains of interest to determine the conditions under which it is globally optimal. Results due to Whittle [22] and Glazebrook [8], which apply quite generally to a class of discounted Markov decision processes in parallel, yield the following prescription:…”

Section: Mixed Models (mentioning)

confidence: 99%

Optimal sequential replenishment of ships during combat

Pilnick

Glazebrook

Gaver

1991

Naval Research Logistics

A carrier battle group is operating in an area where it is subject to attack by enemy aircraft. It is anticipated that air raids will occur in large waves. The uncertain time between raids is available for the replenishment of supplies. We consider the problem of how best to schedule ammunition replenishment during this period. The theory of Gittins indices provides the technical background to the development of a range of models which yield a hierarchy of index‐based heuristics for replenishment. One such heuristic is assessed computationally in a more realistic scenario than is explicitly allowed for by the models.

“…In this section, we formulate the node selection problem as a partially observed Markov decision process (POMDP) multi-arm bandit system [15], which has been widely studied in operations research in the context of an infinite-horizon discounted cost stochastic control problems [16,17]. This problem is studied to make the optimal decision of which arm of the multi-slot gambler machine to pull each time to maximize the total reward.…”

Section: Solving the Node Selection Problem (mentioning)

confidence: 99%

Distributed node selection for threshold key management with intrusion detection in mobile ad hoc networks

Tang

2010

Wireless Netw

In mobile ad hoc networks (MANETs), identity (ID)-based cryptography with threshold secret sharing is a popular approach to security design. Most previous work on key management in this framework concentrates on protocols and structures. Consequently, the question of how to optimally conduct node selection in ID-based cryptography with threshold secret sharing is largely ignored. In this paper, we propose a distributed scheme to dynamically select nodes with master key shares to perform the private key generation service. The proposed scheme can minimize the overall threat posed to the MANET while simultaneously taking into account the cost (e.g., energy consumption) of using these nodes. Intrusion detection systems are modeled as noisy sensors to derive the system security situations. We use a stochastic system to formulate the MANET to obtain the optimal policy. Simulation results are presented to illustrate the effectiveness of the proposed scheme.

“…For independent projects, however, it was shown first by Gittins and Jones [6] that there exists a project-specific dynamic performance measure, later called the Gittins index of a project, such that optimal allocations are obtained from an index policy which (essentially) amounts to focussing at each point only on those projects which exhibit a maximal Gittins index. This celebrated result was subsequently extended from Gittins' and Jones' original discrete-time, Markovian framework to a completely general continuous-time setting; see, e.g., Whittle [15], Varaiya, Walrand and Buyukkoc [13], Mandelbaum [10], Weber [14], El Karoui and Karatzas [3,4], Kaspi and Mandelbaum [8,9].…”

Section: Introduction (mentioning)

confidence: 96%

On Gittins’ index theorem in continuous time

Bank

Küchler

2007

Stochastic Processes and their Applications

We give a new and comparably short proof of Gittins' index theorem for dynamic allocation problems of the multi-armed bandit type in continuous time under minimal assumptions. This proof gives a complete characterization of optimal allocation strategies as those policies which follow the current leader among the Gittins indices while ensuring that a Gittins index is at an all-time low whenever the associated project is not worked on exclusively. The main tool is a representation property of Gittins index processes which allows us to show that these processes can be chosen to be pathwise lower semi-continuous from the right and quasi-lower semi-continuous from the left. Both regularity properties turn out to be crucial for our characterization and the construction of optimal allocation policies.
