Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/525

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

Abstract: We introduce two tactics, namely the strategically-timed attack and the enchanting attack, to attack reinforcement learning agents trained by deep reinforcement learning algorithms using adversarial examples. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by attacking the agent at only a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an ad…
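The timing idea behind the strategically-timed attack can be illustrated with a simple preference-gap heuristic: attack only at time steps where the policy strongly prefers one action over the others, so a small number of well-placed perturbations is enough to hurt the return. The sketch below is a minimal illustration of that idea, assuming a PyTorch softmax policy; `policy_net` and the threshold value are placeholders, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def should_attack(policy_net, state, threshold=0.8):
    """Decide whether to attack at this time step.

    Heuristic sketch: attack only when the gap between the most- and
    least-preferred action probabilities exceeds a threshold. The network
    and threshold are illustrative placeholders.
    """
    with torch.no_grad():
        logits = policy_net(state.unsqueeze(0))       # shape: (1, num_actions)
        probs = F.softmax(logits, dim=-1).squeeze(0)  # action distribution
    preference_gap = (probs.max() - probs.min()).item()
    return preference_gap > threshold
```

Under such a criterion the adversary stays silent at ambiguous states and spends its perturbation budget only where a forced action change is most likely to matter.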

Cited by 217 publications (179 citation statements); references 4 publications.
“…adversarial examples that change almost every pixel in the input state) has previously been generated by a white-box, policy-access-based approach [21], where the adversarial examples are computed via backpropagation. Lin et al. [4] proposed the strategically-timed attack and the so-called enchanting attack, but the adversary generation still relies on a white-box policy-access assumption and full-state perturbation. Besides, Kos et al. [22] compared the influence of full-state perturbations with random noise, and utilized the value function to guide the adversary injection.…”
Section: Related Work
confidence: 99%
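The white-box, backpropagation-based generation referenced in this excerpt is typically an FGSM-style perturbation of the observation. A minimal sketch under that assumption (PyTorch; the surrogate loss and names are illustrative, not the exact procedure of any cited paper):

```python
import torch
import torch.nn.functional as F

def fgsm_state_perturbation(policy_net, state, epsilon=0.01):
    """FGSM-style full-state perturbation under white-box policy access.

    Pushes the observation in the direction that lowers the probability of
    the currently preferred action. This is a sketch of the general
    technique, not the implementation from the cited papers.
    """
    state = state.clone().detach().requires_grad_(True)
    logits = policy_net(state.unsqueeze(0))
    target = logits.argmax(dim=-1)             # currently preferred action
    loss = F.cross_entropy(logits, target)     # raise loss on that action
    loss.backward()
    adv_state = state + epsilon * state.grad.sign()
    return adv_state.detach()
```

In a strategically-timed setting, a perturbation like this would only be applied at the time steps selected by a criterion such as the preference gap sketched above.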
“…These adversaries can easily fool even seemingly high-performing deep learning models with human-imperceptible perturbations. Such vulnerabilities of deep learning models have been well studied in supervised learning, and also to some extent in RL [3], [4], [20].…”
Section: B. Adversarial Attack
confidence: 99%
“…One line of systematic study of the first question started in image classification, with seminal early observations from Szegedy et al. (2013) that deep artificial neural networks are brittle to adversarial changes in inputs that would otherwise be imperceptible to the human eye. This computer-vision weakness has since been used as an angle of attack to design adversaries for reinforcement-learning agents (Lin et al., 2017), followed by general formal insights on adversarial reinforcement learning in the more classical bandit setting (Jun, Li, Ma, & Zhu, 2018). To analyse human choice frailty, our framework involves two steps, the key one being a machine-vs-machine adversarial step in which a (deep) reinforcement-learning agent is trained to be an adversary to an RNN; the latter model is trained in a previous step to emulate human decisions, following (Dezfouli et al., 2018; Dezfouli, Ashtiani, et al., 2019; Dezfouli, Griffiths, et al., 2019).…”
Section: Introduction
confidence: 99%