2017
DOI: 10.1007/978-3-319-62416-7_19
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks

Abstract: Deep learning classifiers are known to be inherently vulnerable to manipulation by intentionally perturbed inputs, named adversarial examples. In this work, we establish that reinforcement learning techniques based on Deep Q-Networks (DQNs) are also vulnerable to adversarial input perturbations, and verify the transferability of adversarial examples across different DQN models. Furthermore, we present a novel class of attacks based on this vulnerability that enable policy manipulation and induction i…

Cited by 179 publications (160 citation statements)
References 22 publications
“…Behzadan & Munir [15] establish that adversaries can interfere with the training process of DQNs, preventing the victim from learning the correct policy. Specifically, the attacker applies a minimal perturbation to the state observed by the target, so that a different action is chosen as the optimal action at the next state.…”
Section: Attacks Against Reinforcement Learning (mentioning)
confidence: 99%
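The minimal-perturbation step this statement describes is typically realized as a targeted fast-gradient-sign (FGSM) style attack on the DQN's input. Below is a minimal sketch, assuming a PyTorch `q_network` that maps a state tensor to a vector of Q-values; `state` and `adv_action` are hypothetical placeholders, and the cited paper's exact crafting procedure may differ.

```python
import torch
import torch.nn.functional as F

def perturb_state(q_network, state, adv_action, epsilon=0.01):
    # Targeted FGSM-style sketch: nudge the observed state so the DQN's
    # greedy action becomes `adv_action` rather than its original choice.
    # `q_network`, `state`, and `adv_action` are illustrative placeholders.
    state = state.clone().detach().requires_grad_(True)
    q_values = q_network(state)                      # shape: [n_actions]
    # Treat the adversary's desired action as the target "label".
    loss = F.cross_entropy(q_values.unsqueeze(0), torch.tensor([adv_action]))
    loss.backward()
    # Descend the loss w.r.t. the input: a small L-infinity step toward
    # the region where adv_action becomes the argmax of the Q-values.
    return (state - epsilon * state.grad.sign()).detach()
```

Because the step budget `epsilon` bounds the L-infinity norm of the change, the crafted state stays close to the genuine observation while still flipping the greedy action.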
“…DQN is trained by optimizing the loss function of equation 4 via SGD. Behzadan and Munir [3] demonstrated that the function approximators of DQN are also vulnerable to adversarial example attacks. In other words, the set of all possible inputs to the approximated function Q̂ contains elements which cause the approximated function to generate outputs that differ from the output of the original Q function.…”
Section: Attack Model (mentioning)
confidence: 99%
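For context, the standard DQN objective (presumably what the quoted "equation 4" denotes) is the squared temporal-difference error between the online network and a frozen target network. A minimal sketch, assuming PyTorch and a hypothetical batch layout:

```python
import torch
import torch.nn.functional as F

def dqn_td_loss(q_net, target_net, batch, gamma=0.99):
    # Standard DQN squared TD-error; `batch` is assumed to unpack into
    # states s, actions a, rewards r, next states s2, and done flags.
    s, a, r, s2, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s, a; theta)
    with torch.no_grad():
        # The frozen target network Q_hat supplies the bootstrap value.
        bootstrap = target_net(s2).max(dim=1).values
        target = r + gamma * (1.0 - done) * bootstrap
    return F.mse_loss(q_sa, target)
```

Since the loss is computed on Q̂'s outputs, any input that makes Q̂ deviate from the true Q function also corrupts the gradient signal that SGD follows, which is the vulnerability the quoted passage points at.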
“…Accordingly, Behzadan and Munir [3] divide this attack into the two phases of initialization and exploitation. The initialization phase implements the processes that must be performed before the target begins interacting with the environment. The exploitation phase then carries out the attack itself, crafting adversarial inputs such that the target DQN performs the actions dictated by π*_adv.…”
Section: Attack Model (mentioning)
confidence: 99%
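Concretely, the exploitation phase can be pictured as a per-step loop in which each genuine observation is replaced by a crafted one. The sketch below is illustrative only: it reuses the hypothetical `perturb_state` helper from the earlier snippet, and `adv_policy` and `target` (an assumed interface to the victim's observation channel) are placeholders, not the paper's implementation.

```python
def exploitation_phase(adv_policy, target, perturb_state, epsilon=0.01):
    # Illustrative exploitation loop: at every step the attacker perturbs
    # the target's observation so that its DQN selects the action dictated
    # by the adversarial policy pi*_adv.
    state, done = target.reset(), False
    while not done:
        desired = adv_policy(state)                  # a = pi*_adv(s)
        crafted = perturb_state(target.q_network, state, desired, epsilon)
        # The victim observes the crafted state and acts on it.
        state, done = target.step(crafted)
```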