In 1988 Whittle introduced an important but intractable class of restless bandit problems which generalise the multiarmed bandit problems of Gittins by allowing state evolution for passive projects. Whittle's account deployed a Lagrangian relaxation of the optimisation problem to develop an index heuristic. Despite a developing body of evidence (both theoretical and empirical) which underscores the strong performance of Whittle's index policy, a continuing challenge to implementation is the need to establish that the competing projects all pass an indexability test. In this paper we employ Gittins' index theory to establish the indexability of (inter alia) general families of restless bandits which arise in problems of machine maintenance and stochastic scheduling problems with switching penalties. We also give formulae for the resulting Whittle indices. Numerical investigations testify to the outstandingly strong performance of the index heuristics concerned.
We propose a general Markovian model for the optimal control of admissions and subsequent routing of customers for service provided by a collection of heterogeneous stations. Queue-length information is available to inform all decisions. Admitted customers will abandon the system if required to wait too long for service. The optimisation goal is the maximisation of reward rate earned from service completions, net of the penalties paid whenever admission is denied, and the costs incurred upon every customer loss through impatience. We show that the system is indexable under mild conditions on model parameters and give an explicit construction of an index policy for admission control and routing founded on a proposal of Whittle for restless bandits. We are able to gain insights regarding the strength of performance of the index policy from the nature of solutions to the Lagrangian relaxation used to develop the indices. These insights are strengthened by the development of performance bounds. Although we are able to assert the optimality of the index heuristic in a range of asymptotic regimes, the performance bounds are also able to identify instances where its performance is relatively weak. Numerical studies are used to illustrate and support the theoretical analyses.
We develop appropriately generalized notions of indexability for problems of dynamic resource allocation where the resource concerned may be assigned more flexibility than is allowed, for example, in classical multi-armed bandits. Most especially we have in mind the allocation of a divisible resource (manpower, money, equipment) to a collection of objects (projects) requiring it in cases where its over-concentration would usually be far from optimal. The resulting project indices are functions of both a resource level and a state. They have a simple interpretation as a fair charge for increasing the resource available to the project from the specified resource level when in the specified state. We illustrate ideas by reference to two model classes which are of independent interest. In the first, a pool of servers is assigned dynamically to a collection of service teams, each of which mans a service station. We demonstrate indexability under a natural assumption that the service rate delivered is increasing and concave in the team size. The second model class is a generalization of the spinning plates model for the optimal deployment of a divisible investment resource to a collection of reward generating assets. Asset indexability is established under appropriately drawn laws of diminishing returns for resource deployment. For both model classes numerical studies provide evidence that the proposed greedy index heuristic performs strongly. This is a class of models concerned with the sequential allocation of effort, to be thought of as a single indivisible resource, to a collection of stochastic reward generating projects (or bandits as they are sometimes called). Gittins demonstrated that optimal project choices are those of highest index. There is no doubt that the idea that strongly performing policies are determined by simple, interpretable calibrations (i.e., indices) of decision options is an attractive and powerful one and offers crucial computational benefits. There is now substantial literature describing extensions to and reformulations of Gittins' result. Some key contributions are cited in the recent survey of Mahajan and Teneketzis [14].Whittle [21] introduced a class of restless bandit problems (RBPs) as a means of addressing a critical limitation of Gittins' MABs, namely, that projects should remain frozen while not in receipt of effort. In RBPs, projects may change state while active or passive though according to different dynamics. However, this generalization is bought at great cost. In contrast to MABs, RBPs are almost certainly intractable having been shown to be PSPACE-hard by Papadimitriou and Tsitsiklis [16]. Whittle [21] proposed an index heuristic for those RBPs which pass an indexability test. This heuristic reduces to Gittins' index policy in the MAB case. Whittle's index emerges from a Lagrangian relaxation of the original problem and has an interpretation as a fair charge for the allocation of effort to a particular project in a particular state. Weber and Weiss [20] established a fo...
This paper concerns two families of Markov decision problem that fall within the family of (bi-directional) restless bandits, an intractable class of decision processes introduced by Whittle. The spinning plates problem concerns the optimal management of a portfolio of reward-generating assets whose yields grow with investment but otherwise tend to decline. In the model of asset exploitation called the squad system, the yield from an asset tends to decline when it is used but will recover when the asset is at rest. In all cases, simply stated conditions are given that guarantee indexability of the problem, together with conditions necessary and sufficient for its strict indexability. The index heuristics for asset activation that emerge from the analysis are assessed numerically and found to perform very strongly.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.