2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016
DOI: 10.1109/iros.2016.7759651

D++: Structural credit assignment in tightly coupled multiagent domains

citations
Cited by 25 publications
(14 citation statements)
references
References 12 publications
“…Notably, MERL is able to learn on coupling greater than n = 6 where methods without explicit reward shaping have been shown to fail entirely [Rahmattalabi et al, 2016]. This is consistent with the performances of our baselines as none of them use explicit domain-specific reward shaping.…”
Section: Results (supporting)
confidence: 83%
“…In practice, optimizing with respect to f r rather than g improves the performance since f r is more sensitive to robot r’s plan and the variance of f r is less affected by the uncertainty of the other robots’ plans (Wolpert et al, 2013). We chose this local utility function since it is generally applicable, although further performance improvements could be achieved with problem-specific heuristics (Rahmattalabi et al, 2016). We note that this formulation assumes that all robots know the global utility function g.…”
Section: Dec-MCTS (mentioning)
confidence: 99%
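
The statement above contrasts a per-robot local utility (written f r) with the global utility g but does not reproduce its definition. As a minimal sketch, assuming the local utility is a difference-style evaluation (g with robot r's plan minus g with that plan replaced by an empty one), the idea can be illustrated as follows. The coverage-style `global_utility`, the plan representation, and all names here are hypothetical and chosen only for illustration; they are not taken from the cited papers.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical representation: a plan is the set of targets a robot will visit.
Plan = FrozenSet[str]
JointPlan = Dict[str, Plan]


def global_utility(plans: JointPlan) -> int:
    """Toy global utility g: number of distinct targets covered by the team."""
    covered = set()
    for plan in plans.values():
        covered |= plan
    return len(covered)


def local_utility(
    g: Callable[[JointPlan], int],
    plans: JointPlan,
    robot: str,
    null_plan: Plan = frozenset(),
) -> int:
    """Assumed difference-style local utility for one robot.

    Computes g(joint plan) - g(joint plan with `robot`'s plan replaced
    by `null_plan`), i.e. that robot's marginal contribution to g.
    """
    counterfactual = dict(plans)
    counterfactual[robot] = null_plan
    return g(plans) - g(counterfactual)


plans: JointPlan = {
    "r1": frozenset({"a", "b"}),
    "r2": frozenset({"b", "c"}),
    "r3": frozenset({"c"}),
}

contributions = {r: local_utility(global_utility, plans, r) for r in plans}
```

In this toy case only `r1` has a non-zero local utility, because target `a` is covered by no other robot, while the targets of `r2` and `r3` are redundant. This is the sense in which such a local signal is more sensitive to one robot's own plan than g is.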
“…Although the agents could use the global utility o directly, optimising with respect to f i instead results in faster convergence since f i is less affected by the unknown plans of teammates, aka. World Utility (Rahmattalabi et al 2016).…”
Section: Phase 1: Local Sub-goal Tree Search (mentioning)
confidence: 99%