2004
DOI: 10.1613/jair.1497

Solving Transition Independent Decentralized Markov Decision Processes

Abstract: Formal treatment of collaborative multi-agent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of these models remains a serious obstacle. To overcome this complexity barrier, we identify a specific class of decentralized MDPs in which the agents' transitions are independent. The class consists of independent colla…
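As an illustration of the model class the abstract identifies, the following is a minimal hypothetical sketch (not the paper's algorithm) of a two-agent transition-independent decentralized MDP: each agent's next state depends only on its own state and action, and the agents are coupled solely through a joint reward. All names, the line-world dynamics, and the meeting reward are illustrative assumptions.

```python
# Hypothetical sketch of a transition-independent Dec-MDP with two agents.
# Each agent moves on a 5-cell line world; transitions are fully local,
# while the reward depends on the joint state (the only coupling).

def local_transition(state, action):
    """Per-agent transition: step left or right, clipped to cells 0..4."""
    if action == "right":
        return min(state + 1, 4)
    return max(state - 1, 0)

def joint_reward(s1, s2):
    """Joint reward: agents are paid for occupying the same cell (illustrative)."""
    return 1.0 if s1 == s2 else 0.0

def simulate(policy1, policy2, steps=10, start=(0, 4)):
    """Roll out decentralized policies; each policy sees only its own state."""
    s1, s2 = start
    total = 0.0
    for _ in range(steps):
        a1, a2 = policy1(s1), policy2(s2)
        # Transition independence: each state updates from its own action only.
        s1 = local_transition(s1, a1)
        s2 = local_transition(s2, a2)
        total += joint_reward(s1, s2)
    return total

# Greedy decentralized policies that walk toward the middle cell 2.
def toward_mid(s):
    return "right" if s < 2 else "left"

print(simulate(toward_mid, toward_mid))  # agents meet at cell 2 and stay synchronized
```

Because the joint transition factors into per-agent transitions, each agent's reachable state set can be analyzed independently; only the value of a joint policy requires reasoning over the coupled reward, which is the structure the paper's algorithm exploits.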

Cited by 116 publications (141 citation statements)
References 20 publications (28 reference statements)
“…Some examples include distributed robot control [2,9] and networking problems [3,12]. In multiagent domains, robot sensors not only provide uncertain and incomplete information about their own state, but also about the location of the other robots.…”
“…[18] proposes a solution based on Dynamic Programming (DP), while [19] extends point-based DP to the case of decentralized agents, and [20] applies heuristic search. Work in [21] has focused on a simpler model, denoted transition-independent where agents do not affect each other's state but cooperate via a joint reward signal instead. The difference between these algorithms and our work lies in their assumption that the agents' internal states are discrete and their modeling of a low number of action outcomes.…”
Section: Decision Theoretic Multiagent Planning
“…The assumption is realistic since many proprioceptive signals are accessible to measurement by the agent itself but are not communicated to other agents. Therefore the approach in [21] is of particular interest to us in this paper. Our model of interest is thus that of transition independent DEC-HMDPs (TI-DEC-HMDPs), where agents do not communicate but are subject to internal resource constraints.…”
Section: Decision Theoretic Multiagent Planning
“…Recently, several researchers have concentrated on finding approximate solutions or finding new subsets of DEC-POMDP problems which are easier to solve and can model some real world problems (Becker et al. 2004; Goldman and Zilberstein 2004). Although current approximate solution algorithms are able to solve slightly larger problems than exact solution algorithms, they still need considerable improvement in order to handle real world problems involving very large state spaces (Bernstein et al. 2005; Nair et al. 2003).…”
Section: Introduction