2021
DOI: 10.1109/lra.2021.3062803

PRIMAL$_2$: Pathfinding Via Reinforcement and Imitation Multi-Agent Learning - Lifelong

Abstract: Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF), an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one, in dense and highly structured environments, typical of real-world warehouse operations. Effectively solving LMAPF in such environments requires expensive coordination be…

citations
Cited by 82 publications
(62 citation statements)
references
References 28 publications
“…Table 1 summarizes the DRL multi-robot path planning methods and the advantages and limitations of each method. From the information in Table 1, it can be summarized that shared parameter type algorithms such as MADDPG and ME-MADDPG can be used in dynamic and complex environments [1][2][3][4]; decentralized architectures such as DQN and DDQN can be considered in stable environments [5][6][7]; large robotic systems facing a large number of dynamic obstacles can be considered using algorithms such as A2C, A3C and TDueling [8][9][10][11]. Validity validated on only a few teams of agents.…”
Section: DRL Multi-robot Path Planning Methods (mentioning)
confidence: 99%
“…More recently, some works have attempted to leverage machine-learning techniques for solving MAPF. These techniques learn from planning demonstrations collected offline to directly predict the next actions of agents given the current observations by means of reinforcement learning [9,43,50] or using graph neural networks [38,39]. Despite such progress, it remains challenging to determine how these techniques should be applied to MAPP in continuous spaces due to the inherent limitation that assumes the search space to be given a priori (typically as a grid map).…”
Section: Related Work (mentioning)
confidence: 99%
“…By contrast, some studies [5,8,9,24,25,37] have also considered an application to maze-like environments. For example, Damani et al. [5] proposed pathfinding via reinforcement and imitation multi-agent learning - lifelong (PRIMAL$_2$), a distributed reinforcement learning framework for lifelong MAPF (LMAPF), which is a variant of MAPF in which agents are repeatedly assigned new destinations. However, they assumed that tasks are sparsely generated at random locations, and thus, unlike our environment, no local congestion occurs.…”
Section: Related Work (mentioning)
confidence: 99%