In this article we consider the problem of optimal control in collaborative multi-agent systems with stochastic dynamics. The agents have a joint task in which they must reach a number of target states. The dynamics of each agent contain additive control and additive noise, and the autonomous part factorizes over the agents. Full observation of the global state is assumed. The goal is to minimize the accumulated joint cost, which consists of integrated instantaneous costs and a joint end cost; the end cost expresses the joint task of the agents, while the instantaneous costs are quadratic in the control and factorize over the agents. The optimal control is a weighted linear combination of single-agent-to-single-target controls, which are expressed in terms of diffusion processes. When these controls have no closed-form expression, they are formulated as path integrals, which are approximated by Metropolis-Hastings sampling. The weights in the control are interpreted as marginals of a joint distribution over agent-to-target assignments, whose structure is represented by a graphical model; the marginals are obtained by graphical model inference. Exact inference breaks down in large systems, so approximate inference methods are needed. We use the naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics, compare these approximations with the exact solution, and show that they compute the optimal control accurately. Finally, we demonstrate the control method in multi-agent systems with nonlinear dynamics consisting of up to 80 agents that have to reach an equal number of target states.
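The Metropolis-Hastings step mentioned above can be sketched generically. Everything below (the function name, the step size, the quadratic "path action") is an illustrative assumption, not the authors' implementation: a random-walk Metropolis-Hastings chain draws samples from an unnormalized density exp(-S(x)/λ), and sample averages then estimate the expectations that define the single-agent-to-single-target controls.

```python
import math
import random

def metropolis_hastings(log_p, x0, n_samples, step=0.5, burn_in=1000):
    """Random-walk Metropolis-Hastings: draw samples from exp(log_p).

    log_p : unnormalized log-density, e.g. -S(x)/lambda for a path action S.
    """
    x, samples = x0, []
    lp = log_p(x)
    for i in range(burn_in + n_samples):
        x_new = x + random.gauss(0.0, step)   # symmetric Gaussian proposal
        lp_new = log_p(x_new)
        # Accept with probability min(1, exp(lp_new - lp))
        if math.log(random.random()) < lp_new - lp:
            x, lp = x_new, lp_new
        if i >= burn_in:
            samples.append(x)
    return samples

# Toy "action": S(x) = (x - 1)^2 / (2 * 0.25), lambda = 1,
# i.e. a Gaussian target with mean 1 and variance 0.25.
random.seed(0)
samples = metropolis_hastings(lambda x: -(x - 1.0) ** 2 / (2 * 0.25), 0.0, 20000)
mean = sum(samples) / len(samples)
```

In the path integral setting the scalar state would be replaced by a discretized trajectory and S by its path cost, but the accept/reject mechanics are identical.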
In this article we consider the problem of stochastic optimal control, in continuous time and continuous state-action space, of systems with state constraints. Such systems typically appear in robotics, where hard obstacles constrain the state space of the robot. A common approach is to solve the problem locally using a linear-quadratic Gaussian (LQG) method. We take a different approach and apply path integral control as introduced by Kappen.
We consider multi-agent systems with stochastic nonlinear dynamics in continuous space-time. We focus on systems of agents that aim to visit a number of given target locations at given points in time at minimal control cost. The online optimization of which agent has to visit which target requires the solution of the Hamilton-Jacobi-Bellman (HJB) equation, a nonlinear partial differential equation (PDE). Under some conditions, the log-transform can be applied to turn the HJB equation into a linear PDE. We then show that the optimal solution of the multi-agent scheduling problem can be expressed in closed form as a sum of single-schedule solutions.
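The log-transform referred to above is the standard one in path integral control. The sketch below uses generic symbols (cost-to-go $J$, drift $b$, state cost $V$, control cost matrix $R$, noise covariance $\nu$, scalar $\lambda$), which are assumptions rather than this paper's exact notation. After minimizing over the control, the HJB equation reads

```latex
-\partial_t J = V + b^{\top}\nabla J
  - \tfrac{1}{2}(\nabla J)^{\top} R^{-1} \nabla J
  + \tfrac{1}{2}\operatorname{Tr}\!\left(\nu \nabla^2 J\right),
\qquad u^{*} = -R^{-1}\nabla J .
```

Under the condition $\nu = \lambda R^{-1}$, the substitution $\psi = \exp(-J/\lambda)$ makes the quadratic term cancel against the noise term, leaving a PDE that is linear in $\psi$:

```latex
\partial_t \psi = \left(\frac{V}{\lambda}
  - b^{\top}\nabla - \tfrac{1}{2}\operatorname{Tr}\!\left(\nu \nabla^2\right)\right)\psi .
```

Linearity is what allows the multi-agent solution to be written as a sum of single-schedule solutions: a sum of solutions of a linear PDE is again a solution.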
We study optimal control in large stochastic multi-agent systems in continuous space and time. We consider multi-agent systems in which agents have independent dynamics with additive noise and control. The goal is to minimize the joint cost, which consists of a state-dependent term and a term quadratic in the control. The system is described by a mathematical model, and an explicit solution is given. We focus on large systems in which agents have to distribute themselves over a number of targets at minimal cost. In this setting the optimal control problem is equivalent to a graphical model inference problem. Exact inference is intractable for large systems, and we use the mean field approximation to compute accurate approximations of the optimal controls. We conclude that near-optimal control in large stochastic multi-agent systems is possible with this approach.
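The mean field computation described above can be illustrated on a toy assignment problem. Everything below is an assumption for illustration, not this paper's model: the hard one-target-per-agent constraint is softened into a collision penalty with strength beta, and naive mean field then updates each agent's marginal over targets given the current marginals of the other agents.

```python
import math

def mean_field_assignment(w, beta=5.0, iters=200):
    """Naive mean field marginals q[i][a] for agent-to-target assignment.

    Illustrative model: p(sigma) ∝ prod_i w[i][sigma_i] * exp(-beta * #collisions),
    a softened version of the one-target-per-agent constraint.
    Fixed-point update: q_i(a) ∝ w[i][a] * exp(-beta * sum_{j != i} q_j(a)).
    """
    n, m = len(w), len(w[0])
    q = [[1.0 / m] * m for _ in range(n)]  # start from uniform marginals
    for _ in range(iters):
        for i in range(n):
            logits = [
                math.log(w[i][a]) - beta * sum(q[j][a] for j in range(n) if j != i)
                for a in range(m)
            ]
            mx = max(logits)                       # stabilize the softmax
            un = [math.exp(l - mx) for l in logits]
            z = sum(un)
            q[i] = [u / z for u in un]
    return q

# Two agents, two targets: agent 0 slightly prefers target 0, agent 1 target 1.
w = [[2.0, 1.0], [1.0, 2.0]]
q = mean_field_assignment(w)
```

The fixed point concentrates each agent on its preferred target; in the control setting these marginals would serve as the weights in the linear combination of single-agent controls.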
Several studies have shown that human motor behavior can be successfully described using optimal control theory, which describes behavior by optimizing the trade-off between the subject's effort and performance. This approach predicts that subjects reach the goal exactly at the final time. However, another strategy might be that subjects try to reach the target position well before the final time to avoid the risk of missing the target. To test this, we have investigated whether minimizing the control effort and maximizing the performance is sufficient to describe human motor behavior in time-constrained motor tasks. In addition to the standard model, we postulate a new model which includes an additional cost criterion which penalizes deviations between the position of the effector and the target throughout the trial, forcing arrival on target before the final time. To investigate which model gives the best fit to the data and to see whether that model is generic, we tested both models in two different tasks where subjects used a joystick to steer a ball on a screen to hit a target (first task) or one of two targets (second task) before a final time. Noise of different amplitudes was superimposed on the ball position to investigate the ability of the models to predict motor behavior for different levels of uncertainty. The results show that a cost function representing only a trade-off between effort and accuracy at the end time is insufficient to describe the observed behavior. The new model correctly predicts that subjects steer the ball to the target position well before the final time is reached, which is in agreement with the observed behavior. This result is consistent for all noise amplitudes and for both tasks.