We develop appropriately generalized notions of indexability for problems of dynamic resource allocation where the resource concerned may be assigned more flexibly than is allowed, for example, in classical multi-armed bandits. Most especially we have in mind the allocation of a divisible resource (manpower, money, equipment) to a collection of objects (projects) requiring it in cases where its over-concentration would usually be far from optimal. The resulting project indices are functions of both a resource level and a state. They have a simple interpretation as a fair charge for increasing the resource available to the project from the specified resource level when in the specified state. We illustrate ideas by reference to two model classes which are of independent interest. In the first, a pool of servers is assigned dynamically to a collection of service teams, each of which mans a service station. We demonstrate indexability under a natural assumption that the service rate delivered is increasing and concave in the team size. The second model class is a generalization of the spinning plates model for the optimal deployment of a divisible investment resource to a collection of reward generating assets. Asset indexability is established under appropriately drawn laws of diminishing returns for resource deployment. For both model classes numerical studies provide evidence that the proposed greedy index heuristic performs strongly.
This is a class of models concerned with the sequential allocation of effort, to be thought of as a single indivisible resource, to a collection of stochastic reward generating projects (or bandits as they are sometimes called). Gittins demonstrated that optimal project choices are those of highest index.
There is no doubt that the idea that strongly performing policies are determined by simple, interpretable calibrations (i.e., indices) of decision options is an attractive and powerful one and offers crucial computational benefits. There is now a substantial literature describing extensions to and reformulations of Gittins' result. Some key contributions are cited in the recent survey of Mahajan and Teneketzis [14]. Whittle [21] introduced a class of restless bandit problems (RBPs) as a means of addressing a critical limitation of Gittins' MABs, namely, that projects should remain frozen while not in receipt of effort. In RBPs, projects may change state while active or passive, though according to different dynamics. However, this generalization is bought at great cost. In contrast to MABs, RBPs are almost certainly intractable, having been shown to be PSPACE-hard by Papadimitriou and Tsitsiklis [16]. Whittle [21] proposed an index heuristic for those RBPs which pass an indexability test. This heuristic reduces to Gittins' index policy in the MAB case. Whittle's index emerges from a Lagrangian relaxation of the original problem and has an interpretation as a fair charge for the allocation of effort to a particular project in a particular state. Weber and Weiss [20] established a fo...
Queueing networks describe complex stochastic systems of both theoretical and practical interest. They provide the means to assess alterations, diagnose poor performance and evaluate robustness across sets of interconnected resources. In the present paper, we focus on the underlying continuous-time Markov chains induced by these networks, and we present a flexible method for drawing parameter inference in multi-class Markovian cases with switching and different service disciplines. The approach is directed towards the inferential problem with missing data, where transition paths of individual tasks among the queues are often unknown. The paper introduces a slice sampling technique with mappings to the measurable space of task transitions between the service stations. This can address time and tractability issues in computational procedures, handle prior system knowledge and overcome common restrictions on service rates across existing inferential frameworks. Finally, the proposed algorithm is validated on synthetic data and applied to a real data set, obtained from a service delivery tasking tool implemented in two university hospitals.
Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.
The class of restless bandits as proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual bandits have multiple levels of activation but are subject to an overall resource constraint. The contribution is motivated by the recent works of Glazebrook et al. (2011a), (2011b), who discussed the performance of index heuristics for resource allocation in such systems. Hitherto, index heuristics have been shown, under a condition of full indexability, to be optimal for a natural Lagrangian relaxation of such problems in which a resource is purchased rather than constrained. We find that, under key assumptions about the nature of solutions to a deterministic differential equation, the index heuristics above are asymptotically optimal in a sense described by Whittle. We then demonstrate that these assumptions always hold for three-state bandits.