2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017
DOI: 10.1109/iros.2017.8206245
Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations

Cited by 110 publications (95 citation statements)
References 15 publications
“…RARL, along with similar methods [12], is able to achieve some robustness, but the level of variation seen during training may not be diverse enough to resemble the variety encountered in the real world. Specifically, the adversary does not actively seek catastrophic outcomes as does the agent constructed in this paper.…”
Section: Introduction
Confidence: 99%
“…However, the nature of the algorithm makes it computationally intensive and increases the delay in detection, because the target agent has to be fooled before the master agent can begin its defense procedure. Furthermore, adversarially robust policy learning (ARPL) was proposed in [180]. This algorithm, targeted at the defense of autonomous agents in physical domains such as self-driving cars and robots, uses adversarial agents during the training of RL agents to make them resilient to adversarial attacks in the form of changes in the environment.…”
Section: B Defense Against Adversarial Attacks in RL
Confidence: 99%
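The statement above describes the core idea of adversarial training for RL agents: an adversary perturbs the environment or observations during training so the learned policy becomes robust to such changes. A minimal sketch of one common instantiation, a gradient-based (FGSM-style) state perturbation against a toy linear policy, is shown below. This is an illustration of the general technique only, not the ARPL authors' implementation; the function names and the linear "policy" are assumptions made for the example.

```python
import numpy as np

def adversarial_perturbation(state, grad, epsilon=0.1):
    """FGSM-style step: shift the state a small amount epsilon in the
    direction indicated by the supplied gradient."""
    return state + epsilon * np.sign(grad)

def perturb_against_policy(state, policy_weights, epsilon=0.1):
    """Adversary step against a toy linear policy whose action score
    is w . s.  The gradient of the score w.r.t. the state is simply w,
    so following the negative gradient lowers the agent's score."""
    grad_of_score = policy_weights            # d(w . s)/ds = w
    return adversarial_perturbation(state, -grad_of_score, epsilon)

# During training, the agent would be updated on these perturbed
# states instead of the clean ones, yielding a more robust policy.
state = np.array([1.0, -2.0, 0.5])
w = np.array([0.3, -0.1, 0.8])
adv_state = perturb_against_policy(state, w, epsilon=0.1)

# The perturbed state yields a strictly lower score than the clean one.
assert w @ adv_state < w @ state
```

In a full training loop the gradient would come from backpropagation through the policy network rather than a closed form, but the perturb-then-train structure is the same.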
“…All adversaries could be subsumed via a single adversary with a large admissible set. However, the resulting dynamics would not capture the underlying structure of the simulation gap [8], and the optimal policy would be too conservative [26]. Therefore, we disambiguate between the different adversaries to capture this structure.…”
Section: P S
Confidence: 99%