Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Multi-agent reinforcement learning (MARL) based approaches have recently been proposed for the distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. In some practical scenarios this may not be the case. We propose a novel MARL approach in which agents are allowed to rehearse with information that will not be available during policy execution. The key is for the agents to learn policies that do not explicitly rely on these rehearsal features. We also establish a weak convergence result for our algorithm, RLaR, demonstrating that RLaR converges in probability when certain conditions are met. We show experimentally that incorporating rehearsal features can enhance the learning rate compared to non-rehearsal-based learners, and demonstrate fast, (near) optimal performance on many existing benchmark Dec-POMDP problems. We also compare RLaR against an existing approximate Dec-POMDP solver which, like RLaR, does not assume a priori knowledge of the model. While RLaR's policy representation is not as scalable, we show that RLaR produces higher-quality policies for most problems and horizons studied.
We adapt a scalable layered intelligence technique from the game industry for agent-based crowd simulation. We extend this approach to support planned movements, pursuit of assignable goals, and avoidance of dynamically introduced obstacles or threats as well as congestion, while keeping the system scalable with the number of agents. We demonstrate the various behaviors in hall-evacuation scenarios, and experimentally establish the scalability of the frame rates with increasing numbers of agents.
We present a novel automated technique for the quantitative validation and comparison of multi-agent based crowd egress simulation systems. Despite much progress in the simulation technology itself, little attention has been accorded to the problem of validating these systems against reality. Previous approaches focused on local (spatial or temporal) crowd patterns, and either resorted to visual comparison (e.g., U-shaped crowd at bottlenecks), or relied on ad-hoc applications of measures such as egress rates, densities, etc. to compare with reality. To the best of our knowledge, we offer the first systematic and unified approach to validate the global performance of a multi-agent based crowd egress simulation system. We employ this technique to evaluate a multi-agent based crowd egress simulation system that we have also recently developed, and compare two different simulation technologies in this system.
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multiagent systems where agents operate with noisy sensors and actuators, as well as local information. Prevalent solution techniques are centralized and model-based, limitations that we address with distributed reinforcement learning (RL). We particularly favor alternate learning, where agents alternately learn best responses to each other, which appears to outperform concurrent RL. However, alternate learning requires an initial policy. We propose two principled approaches to generating informed initial policies: a naive approach that lays the foundation for a more sophisticated approach. We empirically demonstrate that the refined approach produces near-optimal solutions in many challenging benchmark settings, staking a claim to being an efficient (and realistic) approximate solver in its own right. Furthermore, alternate best response learning seeded with such policies quickly learns high-quality policies as well.
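The dependence of alternate learning on its initial policy can be illustrated with a deliberately simplified sketch. It is not the method from the abstract above: it assumes a one-shot, fully cooperative two-agent game with a known shared payoff matrix and computes exact best responses instead of learning them by RL in a Dec-POMDP. The names `PAYOFF` and `alternate_best_response` are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple

# Hypothetical shared payoff: PAYOFF[a1][a2] is the team reward when
# agent 1 plays a1 and agent 2 plays a2. Both (0, 0) and (1, 1) are
# stable joint actions, but only (1, 1) is optimal.
PAYOFF: List[List[float]] = [
    [5.0, 0.0],
    [0.0, 10.0],
]


def alternate_best_response(
    payoff: List[List[float]], init_a2: int, max_rounds: int = 100
) -> Tuple[int, int]:
    """Alternate exact best responses, starting from agent 2's initial action."""
    n1, n2 = len(payoff), len(payoff[0])
    a2 = init_a2
    # Agent 1 responds to agent 2's initial (seed) action.
    a1 = max(range(n1), key=lambda a: payoff[a][a2])
    for _ in range(max_rounds):
        # Agent 2 responds to agent 1, then agent 1 responds to agent 2.
        new_a2 = max(range(n2), key=lambda b: payoff[a1][b])
        new_a1 = max(range(n1), key=lambda a: payoff[a][new_a2])
        if (new_a1, new_a2) == (a1, a2):
            break  # Neither agent wants to change: a stable joint action.
        a1, a2 = new_a1, new_a2
    return a1, a2


if __name__ == "__main__":
    for seed in (0, 1):
        a1, a2 = alternate_best_response(PAYOFF, seed)
        print(f"seed a2={seed} -> joint action ({a1}, {a2}), reward {PAYOFF[a1][a2]}")
```

In this toy game, seeding agent 2 with action 0 leaves the pair at the joint action (0, 0) with reward 5, while seeding with action 1 reaches (1, 1) with reward 10, which is one way to see why an informed initial policy matters for alternate best response schemes.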