2017
DOI: 10.1007/978-3-319-62416-7_19
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks

Abstract: Deep learning classifiers are known to be inherently vulnerable to manipulation by intentionally perturbed inputs, named adversarial examples. In this work, we establish that reinforcement learning techniques based on Deep Q-Networks (DQNs) are also vulnerable to adversarial input perturbations, and verify the transferability of adversarial examples across different DQN models. Furthermore, we present a novel class of attacks based on this vulnerability that enable policy manipulation and induction i…

Cited by 179 publications (160 citation statements)
References 22 publications
“…Behzadan & Munir [15] establish that adversaries can interfere with the training process of DQNs, preventing the victim from learning the correct policy. Specifically, the attacker applies a minimal perturbation to the state observed by the target, so that a different action is chosen as the optimal action at the next state.…”
Section: Attacks Against Reinforcement Learning (mentioning)
confidence: 99%
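The minimal-perturbation step this statement describes is typically realized as a targeted fast-gradient-sign (FGSM) style attack on the DQN's input. Below is a minimal sketch, assuming a PyTorch `q_network` that maps a state tensor to a vector of Q-values; `state` and `adv_action` are hypothetical placeholders, and the cited paper's exact crafting procedure may differ.

```python
import torch
import torch.nn.functional as F

def perturb_state(q_network, state, adv_action, epsilon=0.01):
    # Targeted FGSM-style sketch: nudge the observed state so the DQN's
    # greedy action becomes `adv_action` rather than its original choice.
    # `q_network`, `state`, and `adv_action` are illustrative placeholders.
    state = state.clone().detach().requires_grad_(True)
    q_values = q_network(state)                      # shape: [n_actions]
    # Treat the adversary's desired action as the target "label".
    loss = F.cross_entropy(q_values.unsqueeze(0), torch.tensor([adv_action]))
    loss.backward()
    # Descend the loss w.r.t. the input: a small L-infinity step toward
    # the region where adv_action becomes the argmax of the Q-values.
    return (state - epsilon * state.grad.sign()).detach()
```

Because the step budget `epsilon` bounds the L-infinity norm of the change, the crafted state stays close to the genuine observation while still flipping the greedy action.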
“…DQN is trained by optimizing the loss function of equation 4 via SGD. Behzadan and Munir [3] demonstrated that the function approximators of DQN are also vulnerable to adversarial example attacks. In other words, the set of all possible inputs to the approximated function Q̂ contains elements which cause the approximated function to generate outputs that differ from the output of the original Q function.…”
Section: Attack Model (mentioning)
confidence: 99%
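For context, the standard DQN objective (presumably what the quoted "equation 4" denotes) is the squared temporal-difference error between the online network and a frozen target network. A minimal sketch, assuming PyTorch and a hypothetical batch layout:

```python
import torch
import torch.nn.functional as F

def dqn_td_loss(q_net, target_net, batch, gamma=0.99):
    # Standard DQN squared TD-error; `batch` is assumed to unpack into
    # states s, actions a, rewards r, next states s2, and done flags.
    s, a, r, s2, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s, a; theta)
    with torch.no_grad():
        # The frozen target network Q_hat supplies the bootstrap value.
        bootstrap = target_net(s2).max(dim=1).values
        target = r + gamma * (1.0 - done) * bootstrap
    return F.mse_loss(q_sa, target)
```

Since the loss is computed on Q̂'s outputs, any input that makes Q̂ deviate from the true Q function also corrupts the gradient signal that SGD follows, which is the vulnerability the quoted passage points at.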
“…Accordingly, Behzadan and Munir [3] divide this attack into the two phases of initialization and exploitation. The initialization phase implements the processes that must be performed before the target begins interacting with the environment. The exploitation phase then carries out the attack itself, crafting adversarial inputs such that the target DQN performs the actions dictated by π*_adv.…”
Section: Attack Model (mentioning)
confidence: 99%
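Concretely, the exploitation phase can be pictured as a per-step loop in which each genuine observation is replaced by a crafted one. The sketch below is illustrative only: it reuses the hypothetical `perturb_state` helper from the earlier snippet, and `adv_policy` and `target` (an assumed interface to the victim's observation channel) are placeholders, not the paper's implementation.

```python
def exploitation_phase(adv_policy, target, perturb_state, epsilon=0.01):
    # Illustrative exploitation loop: at every step the attacker perturbs
    # the target's observation so that its DQN selects the action dictated
    # by the adversarial policy pi*_adv.
    state, done = target.reset(), False
    while not done:
        desired = adv_policy(state)                  # a = pi*_adv(s)
        crafted = perturb_state(target.q_network, state, desired, epsilon)
        # The victim observes the crafted state and acts on it.
        state, done = target.step(crafted)
```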