A general model of decentralized stochastic control called partial history sharing information structure is presented. In this model, at each step the controllers share part of their observation and control history with each other. This general model subsumes several existing models of information sharing as special cases. Based on the information commonly known to all the controllers, the decentralized problem is reformulated as an equivalent centralized problem from the perspective of a coordinator. The coordinator knows the common information and selects prescriptions that map each controller's local information to its control actions. The optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP), which is solved using techniques from Markov decision theory. This approach provides (a) structural results for optimal strategies, and (b) a dynamic program for obtaining optimal strategies for all controllers in the original decentralized problem. Thus, this approach unifies the various ad-hoc approaches taken in the literature. In addition, the structural results on optimal control strategies obtained by the proposed approach cannot be obtained by the person-by-person approach, the existing generic method for deriving structural results in decentralized problems; and the dynamic program obtained by the proposed approach is simpler than the one obtained by the designer's approach, the existing generic method for deriving dynamic programs in decentralized problems.
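To make the coordinator viewpoint concrete, the following sketch solves a toy two-state problem by dynamic programming over the coordinator's belief, where the coordinator's "actions" are prescriptions mapping a controller's local observation to a control. The model and all numbers are hypothetical, chosen only to illustrate the prescription idea, not taken from the paper.

```python
import numpy as np

# Toy problem: state s in {0,1}, local observation o in {0,1}, action a in
# {0,1}. The coordinator's belief p = P(s=1) is the common-information state;
# at each step it picks a prescription gamma: o -> a (4 deterministic maps).

P_OBS = np.array([[0.8, 0.2],    # P(o | s=0)
                  [0.3, 0.7]])   # P(o | s=1)
P_NEXT1 = np.array([[0.3, 0.1],  # P(s'=1 | s=0, a)
                    [0.9, 0.2]]) # P(s'=1 | s=1, a)
COST = np.array([[0.0, 1.0],     # c(s=0, a)
                 [2.0, 1.0]])    # c(s=1, a): state 1 is costly, action 1 costs 1

PRESCRIPTIONS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # gamma[o] = a

def step(p, gamma):
    """Expected one-stage cost and next belief under prescription gamma."""
    b = np.array([1.0 - p, p])
    cost = sum(b[s] * P_OBS[s, o] * COST[s, gamma[o]]
               for s in range(2) for o in range(2))
    p_next = sum(b[s] * P_OBS[s, o] * P_NEXT1[s, gamma[o]]
                 for s in range(2) for o in range(2))
    return cost, p_next

def solve(T=5, n_grid=101):
    """Backward dynamic program on a belief grid (linear interpolation)."""
    grid = np.linspace(0.0, 1.0, n_grid)
    V = np.zeros(n_grid)                 # terminal value = 0
    policy = []
    for _ in range(T):
        V_new = np.empty(n_grid)
        pol = np.empty(n_grid, dtype=int)
        for i, p in enumerate(grid):
            q = [c + np.interp(pn, grid, V)
                 for c, pn in (step(p, g) for g in PRESCRIPTIONS)]
            pol[i], V_new[i] = int(np.argmin(q)), min(q)
        V, policy = V_new, [pol] + policy
    return grid, V, policy

grid, V, policy = solve()
print(V[50])  # optimal expected cost-to-go at belief p = 0.5
```

The point of the sketch is that the coordinator never optimizes over control actions directly; it optimizes over the four prescriptions, exactly as in the POMDP reformulation described above.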
We consider a remote estimation problem with an energy harvesting sensor and a remote estimator. The sensor observes the state of a discrete-time source which may be a finite state Markov chain or a multi-dimensional linear Gaussian system. It harvests energy from its environment (say, for example, through a solar cell) and uses this energy for the purpose of communicating with the estimator. Due to the randomness of energy available for communication, the sensor may not be able to communicate all the time. The sensor may also want to save its energy for future communications. The estimator relies on messages communicated by the sensor to produce real-time estimates of the source state. We consider the problem of finding a communication scheduling strategy for the sensor and an estimation strategy for the estimator that jointly minimize an expected sum of communication and distortion costs over a finite time horizon. Our goal of joint optimization leads to a decentralized decision-making problem. By viewing the problem from the estimator's perspective, we obtain a dynamic programming characterization for the decentralized decision-making problem that involves optimization over functions. Under some symmetry assumptions on the source statistics and the distortion metric, we show that an optimal communication strategy is described by easily computable thresholds and that the optimal estimate is a simple function of the most recently received sensor observation.
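The structural result can be illustrated with a small simulation (illustrative parameters, not the paper's optimal thresholds): the sensor transmits when the gap between the source state and the estimator's current estimate exceeds a threshold and energy is available, and the estimator simply holds the most recently received observation.

```python
import random

# Hypothetical sketch of a threshold policy for an energy-harvesting sensor:
# transmit when |x - x_hat| >= threshold AND battery is nonempty; the
# estimator keeps the last received state. Parameters are illustrative.

def simulate(T=10_000, threshold=2, p_harvest=0.5, battery_max=3,
             comm_cost=1.0, seed=0):
    rng = random.Random(seed)
    x, x_hat, energy = 0, 0, battery_max
    total = 0.0
    for _ in range(T):
        x += rng.choice([-1, 0, 1])              # symmetric birth-death source
        if abs(x - x_hat) >= threshold and energy >= 1:
            energy -= 1                          # spend one unit to transmit
            x_hat = x                            # estimator resets to received state
            total += comm_cost
        total += (x - x_hat) ** 2                # squared-error distortion
        energy = min(battery_max, energy + (rng.random() < p_harvest))
    return total / T                             # average cost per step

print(simulate(threshold=2))
```

Sweeping `threshold` in such a simulation trades communication cost against distortion; the paper's contribution is showing that, under symmetry assumptions, a policy of this threshold form is in fact optimal.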
A model of stochastic games where multiple controllers jointly control the evolution of the state of a dynamic system but have access to different information about the state and action processes is considered. The asymmetry of information among the controllers makes it difficult to compute or characterize Nash equilibria. Using common information among the controllers, the game with asymmetric information is shown to be equivalent to another game with symmetric information. Further, under certain conditions, a Markov state is identified for the equivalent symmetric information game and its Markov perfect equilibria are characterized. This characterization provides a backward induction algorithm to find Nash equilibria of the original game with asymmetric information in pure or behavioral strategies. Each step of this algorithm involves finding Bayesian Nash equilibria of a one-stage Bayesian game. The Nash equilibria of the original game that can be characterized in this backward manner are called common information based Markov perfect equilibria.
The n-step delayed sharing information structure is investigated. This information structure comprises K controllers that share their information with a delay of n time steps. It serves as a link between the classical information structure, in which information is shared perfectly among the controllers, and a non-classical information structure, in which there is no "lateral" sharing of information among the controllers. Structural results for optimal control strategies for systems with such information structures are presented. A sequential methodology for finding the optimal strategies is also derived. The solution approach provides insight into identifying structural results and sequential decompositions for general decentralized stochastic control problems.
We consider a networked control system consisting of a remote controller and a collection of linear plants, each associated with a local controller. Each local controller directly observes the state of its co-located plant and can inform the remote controller of the plant's state through an unreliable uplink channel. We assume that the downlink channels from the remote controller to local controllers are perfect. The objective of the local controllers and the remote controller is to cooperatively minimize a quadratic performance cost. We provide a dynamic program for this decentralized control problem using the common information approach. Although our problem is not a partially nested problem, we obtain explicit optimal strategies for all controllers. In the optimal strategies, all controllers compute common estimates of the states of the plants based on the common information obtained from the communication network. The remote controller's action is linear in the common state estimates, and the action of each local controller is linear in both the actual state of its co-located plant and the common state estimates. We illustrate our results with numerical experiments using randomly generated models.

Contributions of the Paper
The main contributions of the paper are as follows.
1) We investigate a decentralized stochastic control problem in which local controllers send their information to a remote controller over unreliable links. To the best of our knowledge, this is the first paper that solves an optimal decentralized control problem with unreliable communication between controllers (in contrast to problems in networked control systems and remote estimation, where the unreliable communication is between sensors/encoders and controllers or between controllers and actuators).
2) The information structure of our problem is not partially nested, hence we cannot a priori restrict attention to linear strategies for optimal control.
We use ideas from the common information approach of [43] to compute optimal controllers. Since the state and action spaces of our problem are Euclidean, the results and arguments of [43] for finite spaces cannot be directly applied. We provide a complete set of results to adapt the common information approach to our linear-quadratic setting with a non-partially nested information structure. Our rigorous proofs carefully handle measurability constraints, the existence of well-defined value functions, and infinite-dimensional strategy spaces.
3) We show that the optimal control strategies of this problem admit simple structures: the optimal remote control is linear in the common estimates of the system states, and each optimal local control is linear in both the common estimates of the system states and the perfectly observed local state. The main strengths of our result are that (i) it provides a simple strategy that is proven to be optimal: not only is the strategy in Theorem 3 linear, it uses estimates that can be easily updated; (ii) it provides a tractable way of computing the gain ma...
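The common-estimate mechanism described above can be sketched numerically for a scalar plant. The gains below are illustrative assumed-stabilizing values, not the paper's optimal gains from its coupled recursions; the point is only to show the estimate resetting on a successful uplink transmission and otherwise being propagated from common information alone.

```python
import numpy as np

# Scalar sketch of the common-estimate update under an unreliable uplink.
# Model and gains are hypothetical: A is an unstable plant, K_LOCAL and
# K_REMOTE are assumed stabilizing linear gains.

A, B_LOCAL, B_REMOTE = 1.2, 1.0, 1.0
K_LOCAL, K_REMOTE = -0.6, -0.4

def run(T=200, p_drop=0.3, seed=1):
    rng = np.random.default_rng(seed)
    x, x_hat = 0.0, 0.0                # true state, common estimate
    xs = []
    for _ in range(T):
        u_rem = K_REMOTE * x_hat       # remote: linear in the common estimate
        u_loc = K_LOCAL * x            # local: sees its own plant state
        x_next = A * x + B_LOCAL * u_loc + B_REMOTE * u_rem + rng.normal(0.0, 0.1)
        if rng.random() > p_drop:
            x_hat = x_next             # uplink delivered: estimate resets to truth
        else:
            # dropped packet: propagate the common estimate, predicting the
            # local input from common information as K_LOCAL * x_hat
            x_hat = A * x_hat + B_LOCAL * K_LOCAL * x_hat + B_REMOTE * u_rem
        x = x_next
        xs.append(x)
    return np.array(xs)

traj = run()
```

With these (assumed) gains the closed loop contracts both the state and the estimation error between successful transmissions, so the trajectory remains bounded despite the drops and the open-loop instability.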
Abstract. We consider a class of two-player dynamic stochastic nonzero-sum games where the state transition and observation equations are linear and the primitive random variables are Gaussian. Each controller acquires possibly different dynamic information about the state process and the other controller's past actions and observations. This leads to a dynamic game of asymmetric information among the controllers. Building on our earlier work on finite games with asymmetric information, we devise an algorithm to compute a Nash equilibrium by using the common information among the controllers. We call such equilibria common information based Markov perfect equilibria of the game; they can be viewed as a refinement of Nash equilibrium in games with asymmetric information. If the players' cost functions are quadratic, then we show that under certain conditions a unique common information based Markov perfect equilibrium exists. Furthermore, this equilibrium can be computed by solving a sequence of linear equations. We also show through an example that a game of asymmetric information can have other Nash equilibria that do not correspond to common information based Markov perfect equilibria.

1. Introduction. A game models a scenario where multiple strategic controllers (or players) optimize their objective functionals, which depend not only on their own actions but also on the actions of other controllers. In stochastic static games, players observe the realization of some random state of nature, possibly through separate noisy channels, and use these observations to independently determine their actions so that the expected values of their individual cost (or utility) functions are optimized. In a stochastic dynamic game, on the other hand, the players act at multiple time steps, based on observations or measurements of some dynamic process that is itself driven by past actions as well as random quantities, which could again be called random states of nature.
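The "sequence of linear equations" claim can be made concrete with a static toy version (illustrative coefficients, not from the paper): with quadratic costs, each player's first-order condition is linear in both players' actions, so the stage equilibrium is found by solving a small linear system.

```python
import numpy as np

# Toy two-player quadratic game: player i minimizes
#   J_i(a_1, a_2) = a_i**2 + c_i * a_i * a_other + d_i * a_i.
# Setting dJ_i/da_i = 0 gives coupled LINEAR best-response equations:
#   2*a_1 + c1*a_2 = -d1
#   c2*a_1 + 2*a_2 = -d2
# All coefficients below are hypothetical illustrative values.

c1, d1 = 0.5, 1.0
c2, d2 = -0.3, 2.0

M = np.array([[2.0, c1],
              [c2, 2.0]])
rhs = np.array([-d1, -d2])
a = np.linalg.solve(M, rhs)   # unique equilibrium when M is invertible
print(a)
```

In the dynamic LQ game, a system of this kind is solved backward at every stage, which is why the equilibrium computation reduces to a sequence of linear equations.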
What information each player acquires at each stage of the game determines what is called the information structure of the underlying game. If all the players acquire the same information at each time step, then the dynamic game is said to be a game of symmetric information. However, in many real scenarios, the players do not have access to the same information about the underlying state processes and other players' observations and past actions. Such games are known as games with asymmetric information. For example, several problems in economic interactions [1][2][3], attacks on cyber-physical systems [4], auctions, cryptography, etc., can be modeled as games of asymmetric information among strategic players. Games with symmetric and/or perfect information have been well studied in the literature; see, for example, [5][6][7][8][9]. In these games, the players have the same beliefs on the states of the game, future observations, and future expected costs or payoffs. However, in games with asymmetric information, the players need not have the same beliefs on th...
This paper introduces a new concept for a smart wireless sensor web technology for optimal measurements of surface-to-depth profiles of soil moisture using in-situ sensors. The objective of the technology, supported by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is to enable a guided and adaptive sampling strategy for the in-situ sensor network to meet the measurement validation objectives of spaceborne soil moisture sensors. A potential application for this technology is the validation of products from the Soil Moisture Active/Passive (SMAP) mission. Spatially, the total variability in soil-moisture fields comes from variability in processes on various scales. Temporally, variability is caused by external forcings, landscape heterogeneity, and antecedent conditions. Installing a dense in-situ network to sample the field continuously in time for all ranges of variability is impractical. However, a sparser but smarter network with an optimized measurement schedule can provide the validation estimates by operating in a guided fashion with guidance from its own sparse measurements. The feedback and control take place in the context of a dynamic physics-based hydrologic and sensor modeling system. The overall design of the smart sensor web (including the control architecture, physics-based hydrologic and sensor models, and actuation and communication hardware) is presented in this paper. We also present results illustrating sensor scheduling and estimation strategies as well as initial numerical and field demonstrations of the sensor web concept. It is shown that the coordinated operation of sensors through the control policy results in substantial savings in resource usage.
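A minimal sketch of the guided-sampling idea, under hypothetical scalar dynamics and noise levels (this is not the paper's physics-based model or its actual policy): a Kalman filter tracks a soil-moisture anomaly, and the sensor is actuated only when the predicted estimation variance exceeds a tolerance, trading measurement count against accuracy.

```python
import numpy as np

# Variance-triggered sampling for a scalar AR(1) state tracked by a Kalman
# filter. All parameters are illustrative assumptions.

A_DYN, Q_PROC, R_MEAS, TOL = 0.95, 0.02, 0.05, 0.08

def schedule(T=500, seed=0):
    rng = np.random.default_rng(seed)
    x, x_hat, P = 0.0, 0.0, 1.0
    n_samples = 0
    for _ in range(T):
        x = A_DYN * x + rng.normal(0.0, np.sqrt(Q_PROC))  # true anomaly
        x_hat, P = A_DYN * x_hat, A_DYN**2 * P + Q_PROC   # predict
        if P > TOL:                                       # sample only if needed
            y = x + rng.normal(0.0, np.sqrt(R_MEAS))
            K = P / (P + R_MEAS)
            x_hat, P = x_hat + K * (y - x_hat), (1 - K) * P
            n_samples += 1
    return n_samples, P

n_samples, P = schedule()
print(n_samples, "samples out of 500 steps")
```

Even this crude trigger samples only a fraction of the time steps while keeping the estimation variance below the tolerance, which is the kind of resource saving the coordinated control policy aims for at network scale.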
It is well known that linear dynamical systems with Gaussian noise and quadratic cost (LQG) satisfy a separation principle. Finding the optimal controller amounts to solving two separate dual problems: one for control and one for estimation. For the discrete-time finite-horizon case, each problem is a simple forward or backward recursion. In this paper, we consider a generalization of the LQG problem with two controllers and a partially nested information structure. Each controller is responsible for one of two system inputs, but has access to different subsets of the available measurements. Our paper has three main contributions. First, we prove a fundamental structural result: sufficient statistics for the controllers can be expressed as conditional means of the global state. Second, we give explicit state-space formulae for the optimal controller. These formulae are reminiscent of the classical LQG solution with dual forward and backward recursions, but with the important difference that they are intricately coupled. Lastly, we show how these recursions can be solved efficiently, with computational complexity comparable to that of the centralized problem.
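The two decoupled recursions of the classical centralized case can be written down directly. The sketch below (arbitrary illustrative matrices) runs the backward Riccati pass for the LQR feedback gains and the forward Riccati pass for the predictor (Kalman) gains: the two "dual problems" referred to above, which in the paper's decentralized setting become coupled.

```python
import numpy as np

# Classical finite-horizon LQG separation: backward pass for control,
# forward pass for estimation. Matrices are illustrative values.

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2); R = np.array([[0.1]])          # stage cost x'Qx + u'Ru
W = 0.01 * np.eye(2); V = np.array([[0.1]])   # process / measurement noise cov

def lqr_gains(T):
    """Backward Riccati recursion: returns feedback gains K_0 .. K_{T-1}."""
    P = Q.copy()                              # terminal cost weight
    Ks = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        Ks.append(K)
    return Ks[::-1]

def kalman_gains(T, Sigma0=np.eye(2)):
    """Forward Riccati recursion: returns predictor gains L_0 .. L_{T-1}."""
    S = Sigma0
    Ls = []
    for _ in range(T):
        L = A @ S @ C.T @ np.linalg.inv(C @ S @ C.T + V)
        S = A @ S @ A.T + W - L @ C @ S @ A.T
        Ls.append(L)
    return Ls

Ks, Ls = lqr_gains(200), kalman_gains(200)
```

Note that `lqr_gains` never touches the noise covariances and `kalman_gains` never touches the cost weights; that independence is exactly what breaks in the two-controller partially nested problem, where the analogous recursions are intricately coupled.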