2021
DOI: 10.48550/arxiv.2106.05087
Preprint

Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL

Abstract: Evaluating the worst-case performance of a reinforcement learning (RL) agent under the strongest/optimal adversarial perturbations on state observations (within some constraints) is crucial for understanding the robustness of RL agents. However, finding the optimal adversary is challenging, in terms of both whether we can find the optimal attack and how efficiently we can find it. Existing works on adversarial RL either use heuristics-based methods that may not find the strongest adversary, or directly train a…
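For readers who want a concrete picture of the attack setting the abstract describes, the sketch below shows a common heuristic baseline for evasion attacks on state observations: a single signed-gradient (FGSM-style) step constrained to an ℓ∞ ball. This is illustrative only and is not the optimal-attack method the paper proposes; the policy architecture, observation dimension, and budget eps are hypothetical choices made for the example.

```python
# Illustrative only: a one-step signed-gradient (FGSM-style) attack on an RL
# agent's state observation, constrained to an l_inf ball of radius eps.
# The policy network, observation dimension, and eps are hypothetical.
import torch
import torch.nn as nn


class Policy(nn.Module):
    """Toy categorical policy producing logits over discrete actions."""

    def __init__(self, obs_dim: int = 8, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def fgsm_observation_attack(policy: nn.Module, obs: torch.Tensor,
                            eps: float = 0.05) -> torch.Tensor:
    """Perturb `obs` within an l_inf ball of radius `eps`, pushing the policy
    away from the action it would take on the clean observation."""
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)
    clean_action = logits.argmax(dim=-1, keepdim=True)
    # Loss = log-probability of the clean action; stepping against its
    # gradient decreases that probability.
    log_prob = torch.log_softmax(logits, dim=-1).gather(-1, clean_action).sum()
    log_prob.backward()
    with torch.no_grad():
        adv_obs = obs - eps * obs.grad.sign()  # one signed-gradient step
    return adv_obs.detach()


if __name__ == "__main__":
    policy = Policy()
    obs = torch.randn(1, 8)
    adv_obs = fgsm_observation_attack(policy, obs)
    print("clean action:", policy(obs).argmax(-1).item())
    print("attacked action:", policy(adv_obs).argmax(-1).item())
```

One-step heuristics like this are cheap but, as the abstract notes for heuristics-based methods in general, they may not find the strongest adversary.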

Cited by 4 publications (32 citation statements)
References 17 publications

“…Adversarial Training does not improve the agent's robust performance. As we observed in Figure 2, adversarial training (AT) does not improve the robustness of the agent in either the discrete or the continuous action space, although AT achieves good robustness against ℓ_p attacks in vision tasks [23,48], and against observation attacks [49,40] and action attacks [32] in RL. We hypothesize that this is due to (1) the large number of total agents, (2) the uncertainty of adversarial message channels, and (3) the relatively large perturbation length.…”
Section: C3 Additional Results (mentioning)
confidence: 99%
“…We hypothesize that this is due to (1) the large number of total agents, (2) the uncertainty of adversarial message channels, and (3) the relatively large perturbation length. To be more specific, in related works [32,49,40], an agent and an attacker are trained alternately, during which the agent learns to adapt to the learned attacker. However, in the threat model we consider, C out of N messages are significantly perturbed, which makes it hard for the agent to adapt to the attacks during alternate training.…”
Section: C3 Additional Results (mentioning)
confidence: 99%
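The statement above contrasts its threat model with related works in which an agent and an attacker are trained alternately. The loop below is a minimal sketch of such alternation under made-up components (linear policy and attacker, a synthetic return, an ℓ∞ budget eps); it is not the procedure of the cited works or of this paper.

```python
# Illustrative only: a minimal alternating-training loop where a learned
# attacker perturbs observations inside an l_inf ball and the agent and
# attacker are updated in turn. The linear networks, synthetic objective,
# budget eps, and step counts are all made up for this sketch.
import torch
import torch.nn as nn

obs_dim, n_actions, eps = 4, 2, 0.1
agent = nn.Linear(obs_dim, n_actions)      # toy policy: logits over actions
attacker = nn.Linear(obs_dim, obs_dim)     # toy attacker: perturbation direction
agent_opt = torch.optim.Adam(agent.parameters(), lr=1e-3)
attacker_opt = torch.optim.Adam(attacker.parameters(), lr=1e-3)


def perturb(obs: torch.Tensor) -> torch.Tensor:
    # tanh keeps the perturbation inside the l_inf budget eps.
    return obs + eps * torch.tanh(attacker(obs))


def toy_return(logits: torch.Tensor, obs: torch.Tensor) -> torch.Tensor:
    # Synthetic "return": reward probability mass on action 0 when the first
    # observation feature is positive (purely illustrative objective).
    probs = torch.softmax(logits, dim=-1)
    return (probs[:, 0] * torch.sign(obs[:, 0])).mean()


for _ in range(200):
    obs = torch.randn(32, obs_dim)
    # 1) Agent step: maximize return under the current attacker
    #    (detach() stops gradients from reaching the attacker).
    agent_opt.zero_grad()
    agent_loss = -toy_return(agent(perturb(obs).detach()), obs)
    agent_loss.backward()
    agent_opt.step()
    # 2) Attacker step: minimize the agent's return. Agent parameters receive
    #    gradients here but are not updated; they are zeroed at the next step.
    attacker_opt.zero_grad()
    attacker_loss = toy_return(agent(perturb(obs)), obs)
    attacker_loss.backward()
    attacker_opt.step()
```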
“…Hence, it is imperative to study RL in an adversarial environment. In the last five years, there has been a surge in the number of papers studying the security issues of RL [104,105,106,107,108,109,110,111,112,113,114,115,116,117]. The vulnerabilities of RL come from the information exchange between the agent and the environment.…”
Section: Reinforcement Learning in Adversarial Environment (mentioning)
confidence: 99%