2021
DOI: 10.1007/978-3-030-86855-0_12

Q-Mixing Network for Multi-agent Pathfinding in Partially Observable Grid Environments


Cited by 2 publications (2 citation statements) · References 12 publications
“…It is also worth noting a direction of research where planning algorithms are combined with reinforcement learning. In Skrynnik et al (2021) and Davydov et al (2021), the authors train RL agents in a centralized (QMIX) and decentralized (PPO) way for solving multi-agent pathfinding tasks. The resulting RL policies are combined with a planning approach (MCTS), which leverages the resulting performance.…”
Section: Related Work (mentioning)
confidence: 99%
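The statement above refers to centralized training with QMIX, in which per-agent Q-values are combined by a state-conditioned, monotonic mixing network. The following is a minimal sketch of such a mixing network, assuming per-agent Q-values and a global state are available at training time; the class and layer names (QMixer, hyper_w1, embed_dim, etc.) and sizes are illustrative and not taken from the cited papers.

```python
# Minimal sketch of a QMIX-style mixing network (centralized value factorisation).
# Hypernetworks produce the mixing weights from the global state; taking their
# absolute values keeps the joint value monotone in each agent's Q-value.
import torch
import torch.nn as nn


class QMixer(nn.Module):
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(-1, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(-1, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(-1, 1, 1)
        q_total = torch.bmm(hidden, w2) + b2
        # Joint action value used for the centralized TD loss during training;
        # at execution time each agent acts on its own Q-values only.
        return q_total.view(-1, 1)
```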
“…Compared to playing against a single policy, Smith et al (2020) claimed that such a mechanism introduces stochasticity in the opponent and causes previous experiences to be forgotten, making algorithms slow to converge. Thus, the authors proposed a method that distills the opponent mixture into a single policy via Q-mixing (Davydov et al, 2021).…”
Section: Related Work (mentioning)
confidence: 99%
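For orientation, a minimal sketch of the Q-mixing idea mentioned in the statement above: opponent-specific action values are averaged under a belief over which opponent is currently being faced, yielding a single policy against the opponent mixture. All names and the belief source are illustrative assumptions, not an implementation from the cited papers.

```python
# Sketch of Q-mixing: combine per-opponent Q-values under an opponent belief.
import numpy as np


def q_mixing(q_per_opponent: np.ndarray, opponent_belief: np.ndarray) -> np.ndarray:
    """Combine per-opponent action values into one value estimate.

    q_per_opponent: (n_opponents, n_actions) action values learned against
        each individual opponent policy.
    opponent_belief: (n_opponents,) probability of facing each opponent,
        e.g. from a classifier over the observation history (assumed here).
    """
    return opponent_belief @ q_per_opponent  # (n_actions,)


# Usage: act greedily with respect to the mixed value estimate.
q = np.array([[1.0, 0.2], [0.1, 0.9]])   # toy values against two opponents
belief = np.array([0.7, 0.3])            # current belief over opponents
action = int(np.argmax(q_mixing(q, belief)))
```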