Abstract-This tutorial paper provides a comprehensive characterization of information structures in team decision problems and their impact on the tractability of team optimization. Solution methods for team decision problems are presented in various settings, with the discussion organized around two foci: the first centers on solution methods for stochastic teams admitting state-space formulations; the second is norm-optimal control for linear plants under information constraints.
Abstract-There are only a few learning algorithms applicable to stochastic dynamic teams and games, which generalize Markov decision processes to decentralized stochastic control problems involving possibly self-interested decision makers. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers, who are also learning. In stochastic dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic games, and study their convergence for the weakly acyclic case, which includes team problems as an important special case. The algorithm is decentralized in that each decision maker has access to only its local information, the state information, and the local cost realizations; furthermore, it is completely oblivious to the presence of other decision makers. We show that these algorithms converge to equilibrium policies almost surely in large classes of stochastic games.
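As a concrete anchor, the following is a minimal single-agent tabular Q-learning sketch on a toy two-state MDP; in the decentralized setting described above, each decision maker runs a local update of this kind using only the state, its own action, and its local cost (here, reward) realizations. The MDP, step size, and horizon are illustrative assumptions, and the exploration-phase machinery behind the paper's convergence guarantees is omitted.

```python
import random

random.seed(0)

# Toy deterministic MDP: states {0, 1}, actions {0: stay, 1: toggle}.
# The stage reward is 1 whenever the next state is 1, else 0.
def step(s, a):
    s_next = s if a == 0 else 1 - s
    return s_next, 1.0 if s_next == 1 else 0.0

gamma, alpha, eps = 0.9, 0.1, 0.2
Q = [[0.0, 0.0], [0.0, 0.0]]  # Q[state][action]

s = 0
for _ in range(20000):
    # Epsilon-greedy exploration over the agent's own actions only.
    a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
    s_next, r = step(s, a)
    # Standard Q-learning update with the local reward realization.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
    s = s_next

greedy = [max((0, 1), key=lambda a: Q[s][a]) for s in (0, 1)]
print(greedy)  # optimal policy: toggle in state 0, stay in state 1
```

Because the toy environment is deterministic, the update targets contain no noise and the iterates settle at the optimal Q-values (here V* = 10 in both states, so the greedy policy is toggle-then-stay).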
Abstract-This paper studies the decentralized quadratic cheap talk and signaling game problems when an encoder and a decoder, viewed as two decision makers, have misaligned objective functions. The main contributions of this study are the extension of Crawford and Sobel's cheap talk formulation to multi-dimensional sources and to noisy channel setups. We consider both (simultaneous) Nash equilibria and (sequential) Stackelberg equilibria. We show that for arbitrary scalar sources, in the presence of misalignment, the quantized nature of all equilibrium policies holds for Nash equilibria in the sense that all Nash equilibria are equivalent to those achieved by quantized encoder policies. On the other hand, all Stackelberg equilibrium policies are fully informative. For multi-dimensional setups, unlike the scalar case, Nash equilibrium policies may be of non-quantized nature, and even linear. In the noisy setup, a Gaussian source is to be transmitted over an additive Gaussian channel. The goals of the encoder and the decoder are misaligned by a bias term, and the encoder's cost also includes a penalty term on signal power. Conditions for the existence of affine Nash equilibria as well as general informative equilibria are presented. For the noisy setup, the only Stackelberg equilibrium is the linear equilibrium when the variables are scalar. Our findings provide further conditions on when affine policies may be optimal in decentralized multi-criteria control problems and lead to conditions for the presence of active information transmission in strategic environments.
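To make the quantized-equilibrium statement concrete, the following sketch checks the classical Crawford-Sobel two-cell equilibrium for a uniform scalar source on [0, 1] with quadratic costs and bias b; this is a standard textbook instance, assumed here purely for illustration, and the two-cell partition exists only when b < 1/4.

```python
# Encoder cost (y - x - b)^2, decoder cost (y - x)^2, source uniform on [0, 1].
# For a two-cell quantized equilibrium with boundary t, the decoder
# best-responds with the cell means y1 = t/2 and y2 = (1 + t)/2, and the
# boundary type x = t must be indifferent between them:
#   (y1 - t - b)^2 = (y2 - t - b)^2  =>  t = 1/2 - 2b.
b = 0.05          # misalignment (bias) between encoder and decoder
t = 0.5 - 2 * b   # equilibrium boundary of the two-cell quantizer
y1, y2 = t / 2, (1 + t) / 2

# Numerically verify the indifference of the boundary type.
left = (y1 - t - b) ** 2
right = (y2 - t - b) ** 2
print(t, abs(left - right))
```

As b grows toward 1/4 the boundary t collapses to 0, recovering the familiar fact that larger misalignment supports only coarser quantized equilibria.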
It is known that state-dependent, multi-step Lyapunov bounds lead to greatly simplified verification theorems for stability for large classes of Markov chain models. This is one component of the "fluid model" approach to stability of stochastic networks. In this paper we extend the general theory to a randomized multi-step Lyapunov approach to obtain criteria for stability and steady-state performance bounds, such as finite moments. These results are applied to a remote stabilization problem, in which a controller receives measurements from an erasure channel with limited capacity. Based on the general results in the paper, it is shown that stability of the closed-loop system is assured provided that the channel capacity is greater than the logarithm of the unstable eigenvalue, plus an additional correction term. The existence of a finite second moment in steady-state is established under additional conditions.
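The capacity condition can be illustrated for a scalar plant; the numbers below and the symbolic correction term `delta` are assumptions for illustration, not the paper's exact expression.

```python
import math

# For a scalar unstable plant x_{t+1} = a*x_t + w_t with |a| > 1, stability
# over a limited-capacity erasure channel requires the capacity C (in bits
# per time step) to exceed log2|a| plus a correction term that depends on
# the channel statistics; delta below stands in for that term (assumed).
a = 2.5
min_rate = math.log2(abs(a))  # intrinsic rate needed to track the expansion

def capacity_sufficient(C, delta=0.0):
    return C > min_rate + delta

print(min_rate, capacity_sufficient(2.0), capacity_sufficient(1.0))
```

Intuitively, the state's uncertainty volume grows by a factor |a| per step, i.e. log2|a| bits, so any stabilizing channel must deliver at least that many reliable bits per step.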
We investigate control of a non-linear process when communication and processing capabilities are limited. The sensor communicates with a controller node through an erasure channel which introduces i.i.d. packet dropouts. Processor availability for control is random and, at times, insufficient to calculate plant inputs. To make efficient use of communication and processing resources, the sensor only transmits when the plant state lies outside a bounded target set. Control calculations are triggered by the received data. If a plant state measurement is successfully received while the processor is available for control, the algorithm recursively calculates a sequence of tentative plant inputs, which are stored in a buffer for potential future use. This provides a safeguard for time-steps when the processor is unavailable for control. We derive sufficient conditions on system parameters for stochastic stability of the closed loop and illustrate performance gains through numerical studies.
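A toy scalar simulation, with all parameters and the deadbeat-style tentative inputs chosen purely as illustrative assumptions, sketches the event-triggered transmission and input buffering described above (the paper treats a non-linear plant; a linear one is used here for brevity).

```python
import random

random.seed(1)

# Scalar plant x_{t+1} = a*x_t + u_t + w_t. The sensor transmits only when
# |x| leaves the target set {|x| <= threshold}; packets drop i.i.d. with
# probability drop_p; the processor is busy with probability busy_p. When a
# measurement arrives and the processor is free, a short sequence of
# tentative inputs is computed and buffered for steps when it is not.
a, horizon = 1.2, 200
drop_p, busy_p, threshold = 0.2, 0.3, 1.0
x, buffer = 5.0, []

trajectory = []
for t in range(horizon):
    received = abs(x) > threshold and random.random() > drop_p
    if received and random.random() > busy_p:
        # Processor free: recursively precompute tentative inputs that drive
        # the predicted state to zero, and store them in the buffer.
        pred, buffer = x, []
        for _ in range(3):
            u = -a * pred           # cancels the open-loop growth one step ahead
            buffer.append(u)
            pred = a * pred + u     # predicted state after applying u
    u = buffer.pop(0) if buffer else 0.0  # fall back to buffered (or zero) input
    x = a * x + u + random.gauss(0.0, 0.05)
    trajectory.append(abs(x))

print(max(trajectory[-50:]))  # the state stays bounded once the loop settles
```

The buffer is what keeps the loop stable through short runs of dropouts or processor unavailability; without it, every missed computation would leave the plant in open loop.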
Abstract-We consider the problem of remotely controlling a continuous-time linear time-invariant system driven by a Brownian motion process, when communication takes place over noisy memoryless discrete- or continuous-alphabet channels. What makes this class of remote control problems different from most of the previously studied models is the presence of noise in both the forward channel (connecting sensors to the controller) and the reverse channel (connecting the controller to the plant). For stability of the closed-loop system, we look for the existence of an invariant distribution for the state, for which we show that it is necessary that the entire control space and the state space be encoded, and that the reverse channel be at least as reliable as the forward channel. We obtain necessary conditions and sufficient conditions on the channels and the controllers for stabilizability. Using properties of the underlying sampled Markov chain, we show that under variable-length coding and some realistic channel conditions, stability can be achieved over discrete-alphabet channels even if the entire state and control spaces are to be encoded and the number of bits that can be transmitted per unit time is strictly bounded. For control over continuous-alphabet channels, however, a variable-rate scheme is not necessary. We also show that memoryless policies are rate-efficient for Gaussian channels.