Abstract-We consider a single server serving a time-slotted queueing system of multiple flows, where at most one flow can be serviced in a single time slot. Each flow has exogenous arrivals, and the service rates of the flows vary over time according to a fixed distribution. In each time slot, the server may observe the service rates of only a single subset of flows (chosen from a fixed collection of observable subsets) for the purpose of making scheduling decisions. We provide a precise characterization of the stability region of such a system. We present an online scheduling algorithm that uses information about the marginal distributions to pick the subset and the MaxWeight rule to pick a flow within the subset, and show that it is throughput-optimal. In the case where the observable subsets are all disjoint, we show that a simple scheduling algorithm, Max-Sum-Queue, which picks the subset having the largest sum of squared queue lengths and then applies MaxWeight within that subset, is throughput-optimal. We show that for channels which are symmetric with respect to channel rates and distributions, and fixed-size observable subsets, Max-Sum-Queue is throughput-optimal. Finally, we demonstrate that under certain conditions, Max-Sum-Queue may not be throughput-optimal.
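For the disjoint-subsets case, the two-step Max-Sum-Queue rule described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation; the function name and data representation (dicts of queue lengths and observed rates) are our own.

```python
def max_sum_queue_schedule(queues, observable_subsets, observe_rates):
    """One slot of the Max-Sum-Queue rule (illustrative sketch).

    queues: dict mapping flow -> current queue length
    observable_subsets: list of lists of flows (the observable collection)
    observe_rates: callable(subset) -> dict flow -> observed service rate
    """
    # Step 1: pick the subset with the largest sum of squared queue lengths.
    best_subset = max(observable_subsets,
                      key=lambda s: sum(queues[f] ** 2 for f in s))
    # Step 2: observe rates for that subset only, then apply MaxWeight:
    # serve the flow maximizing (queue length) x (observed service rate).
    rates = observe_rates(best_subset)
    return max(best_subset, key=lambda f: queues[f] * rates[f])
```

Note that only the chosen subset's rates are ever observed, matching the partial-observability constraint in the model.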
We consider a restless multi-armed bandit in which each arm can be in one of two states. When an arm is sampled, the state of the arm is not available to the sampler. Instead, a binary signal with known randomness that depends on the state of the arm is available. No signal is available if the arm is not sampled. An arm-dependent reward is accrued from each sampling. In each time step, each arm changes state according to known transition probabilities, which in turn depend on whether the arm is sampled or not. Since the state of the arm is never visible and has to be inferred from the current belief and a possible binary signal, we call this the hidden Markov bandit. Our interest is in a policy that selects the arm(s) in each time step to maximize the infinite-horizon discounted reward. Specifically, we seek to use Whittle's index in selecting the arms. We first analyze the single-armed bandit and show that, in general, it admits an approximate threshold-type optimal policy when there is a positive reward for the 'no-sample' action. We also identify several special cases for which the threshold policy is indeed optimal. Next, we show that such a single-armed bandit also satisfies an approximate-indexability property. For the case when the single-armed bandit admits a threshold-type optimal policy, we compute the Whittle index for each arm. Numerical examples illustrate the analytical results.
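The belief-state recursion underlying this model (Bayes update on the binary signal when sampled, then a Markov transition) can be sketched as below. This is a generic two-state POMDP-style belief update under the stated observation model, not code from the paper; all names and the matrix layout are assumptions.

```python
def belief_update(p, sampled, signal, P_sample, P_rest, signal_prob):
    """Belief update for a two-state hidden Markov arm (illustrative sketch).

    p: current belief P(state = 1)
    sampled: whether the arm was sampled this step
    signal: observed binary signal (0/1) if sampled, else None
    P_sample, P_rest: 2x2 transition matrices (row = current state), used
        when the arm is sampled / not sampled respectively
    signal_prob: signal_prob[s][y] = P(signal = y | state = s)
    """
    if sampled:
        # Bayes update on the binary signal (the state itself is never seen).
        num = p * signal_prob[1][signal]
        den = num + (1 - p) * signal_prob[0][signal]
        p = num / den
        T = P_sample
    else:
        T = P_rest  # no signal is available when the arm is not sampled
    # Markov transition: new belief that the next state is 1.
    return (1 - p) * T[0][1] + p * T[1][1]
```

Since the state is hidden, this belief is the sufficient statistic on which a threshold-type policy (and the Whittle index computation) operates.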
We study infection spreading on large static networks when the spread is assisted by a small number of additional virtually mobile agents. For networks which are "spatially constrained", we show that the spread of infection can be significantly sped up even by a few virtually mobile agents acting randomly. More specifically, for general networks with bounded virulence (e.g., a single or finite number of random virtually mobile agents), we derive upper bounds on the order of the time taken (as a function of network size) for infection to spread. Conversely, for certain common classes of networks such as linear graphs, grids and random geometric graphs, we also derive lower bounds on the order of the spreading time over all (potentially network-state aware and adversarial) virtual mobility strategies. We show that up to a logarithmic factor, these lower bounds for adversarial virtual mobility match the upper bounds on spreading via an agent with random virtual mobility. This demonstrates that random, state-oblivious virtual mobility is in fact order-wise optimal for dissemination in such spatially constrained networks.
Abstract-We study epidemic spreading processes in large networks, when the spread is assisted by a small number of external agents: infection sources with bounded spreading power, but whose movement is unrestricted vis-à-vis the underlying network topology. For networks which are 'spatially constrained', we show that the spread of infection can be significantly sped up even by a few such external agents infecting randomly. Moreover, for general networks, we derive upper bounds on the order of the spreading time achieved by certain simple (random/greedy) external-spreading policies. Conversely, for certain common classes of networks such as line graphs, grids and random geometric graphs, we also derive lower bounds on the order of the spreading time over all (potentially network-state aware and adversarial) external-spreading policies; these adversarial lower bounds match (up to logarithmic factors) the spreading time achieved by an external agent with a random spreading policy. This demonstrates that random, state-oblivious infection-spreading by an external agent is in fact order-wise optimal for spreading in such spatially constrained networks.
We consider minimisation of dynamic regret in non-stationary bandits with a slowly varying property. Namely, we assume that arms' rewards are stochastic and independent over time, but that the absolute difference between the expected rewards of any arm at any two consecutive time steps is at most a drift limit δ > 0. For this setting, which has received relatively little attention, we give a new algorithm that naturally extends the well-known Successive Elimination algorithm to the non-stationary bandit setting. We establish the first instance-dependent regret upper bound for slowly varying non-stationary bandits. The analysis in turn relies on a novel characterization of the instance as a detectable-gap profile that depends on the expected differences between arm rewards. We also provide the first minimax regret lower bound for this problem, enabling us to show that our algorithm is essentially minimax optimal. Moreover, this lower bound matches that of the more general total-variation-budgeted bandits problem, establishing that the seemingly easier former problem is at least as hard as the more general latter one in the minimax sense. We complement our theoretical results with experimental illustrations.
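To make the drift limit concrete, the sketch below shows one drift-aware elimination check in the spirit of Successive Elimination: an arm is dropped only when another arm's windowed empirical mean exceeds its own by the usual confidence radii plus an allowance of δ per step over the window, since true means can move by at most δ each step. This is our own minimal illustration of the idea, not the paper's algorithm; all names and the exact rule are assumptions.

```python
import math

def eliminate(window_means, counts, delta_conf, drift, window):
    """One drift-aware elimination check (illustrative sketch, not the
    paper's exact rule).

    window_means: dict arm -> empirical mean over the recent window
    counts: dict arm -> number of samples behind that mean
    delta_conf: confidence parameter for the Hoeffding radii
    drift: per-step drift limit (the δ of the slowly varying model)
    window: length of the window, so means may have moved by drift*window
    """
    radius = {a: math.sqrt(math.log(2 / delta_conf) / (2 * counts[a]))
              for a in window_means}
    active = set(window_means)
    for a in list(active):
        for b in window_means:
            if b == a:
                continue
            # Drop arm a if arm b is better even after confidence slack
            # and the maximum possible drift within the window.
            if window_means[b] - window_means[a] > \
                    radius[a] + radius[b] + drift * window:
                active.discard(a)
                break
    return active
```

With drift = 0 this reduces to the standard stationary elimination test; a larger δ or window makes elimination more conservative.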
We consider the problem of adaptively PAC-learning the mode of a probability distribution 𝒫 by querying an oracle for information about a sequence of i.i.d. samples X1, X2, … generated from 𝒫. We consider two different query models: (a) each query is an index i, for which the oracle reveals the value of the sample Xi; (b) each query comprises two indices i and j, for which the oracle reveals whether the samples Xi and Xj are identical. For these query models, we give sequential mode-estimation algorithms which, at each time t, either make a query to the corresponding oracle based on past observations, or decide to stop and output an estimate of the distribution's mode, required to be correct with a specified confidence. We analyze the query complexity of these algorithms for any underlying distribution 𝒫, and derive corresponding lower bounds on the optimal query complexity under the two querying models.
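A minimal sketch of the stop-or-query structure under query model (a): reveal one sample per step and stop once the empirical leader's frequency gap over the runner-up exceeds a confidence radius of order sqrt(log(t/δ)/t). This is a generic sequential-testing illustration under our own stopping rule, not the paper's algorithm; the names and the exact radius are assumptions.

```python
import math
from collections import Counter

def estimate_mode(sample_oracle, delta):
    """Sequential mode estimation under query model (a) (illustrative
    sketch). sample_oracle(i) reveals the value of sample X_i; delta is
    the allowed failure probability. Stops when the leader's empirical
    frequency exceeds the runner-up's by 2*sqrt(log((t+1)/delta)/t).
    """
    counts = Counter()
    t = 0
    while True:
        t += 1
        counts[sample_oracle(t)] += 1  # one query per time step
        if len(counts) >= 2:
            (top, c1), (_, c2) = counts.most_common(2)
            gap = (c1 - c2) / t
            if gap > 2 * math.sqrt(math.log((t + 1) / delta) / t):
                return top  # confident the empirical leader is the mode
```

The query complexity of such a rule naturally scales with the inverse squared gap between the mode's probability and the second-largest probability, which is the instance-dependent quantity the analysis tracks.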