2020 57th ACM/IEEE Design Automation Conference (DAC)
DOI: 10.1109/dac18072.2020.9218663

TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning

Abstract: We present TrojDRL, a tool for exploring and evaluating backdoor attacks on deep reinforcement learning agents. TrojDRL exploits the sequential nature of deep reinforcement learning (DRL) and considers different gradations of threat models. We show that untargeted attacks on state-of-the-art actor-critic algorithms can circumvent existing defenses built on the assumption of backdoors being targeted. We evaluated TrojDRL on a broad set of DRL benchmarks and showed that the attacks require only poisoning as litt…

Cited by 51 publications (58 citation statements)
References 14 publications
“…Unlike classification, in DRL, data is generated by an agent interacting with the environment. TrojDRL (Kiourti et al., 2020) proposes data poisoning attacks on DRL agents by altering the observations, after they have been generated by the environment, under a man-in-the-middle (MITM) attack model. More precisely, the observations are altered by a third party before arriving at the agent to be processed.…”
Section: Trojans in Reinforcement Learning Agents
Confidence: 99%
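The MITM poisoning this citation describes can be sketched as a transformation applied to each observation after the environment emits it and before the agent receives it. The function name, trigger pattern, and patch size below are illustrative assumptions for a pixel-based benchmark, not TrojDRL's actual implementation:

```python
import numpy as np

def poison_observation(obs, trigger_value=255, patch_size=3):
    """Man-in-the-middle poisoning sketch: stamp a small, fixed trigger
    patch onto an observation in transit from environment to agent.
    All parameter choices here are hypothetical."""
    poisoned = obs.copy()  # leave the environment's own state untouched
    poisoned[:patch_size, :patch_size] = trigger_value  # top-left trigger patch
    return poisoned

# Example: an 84x84 grayscale frame, a common shape in Atari DRL benchmarks
frame = np.zeros((84, 84), dtype=np.uint8)
triggered = poison_observation(frame)
```

During poisoned training steps, the attacker would pair such triggered observations with manipulated rewards or actions; at test time, the same patch activates the backdoored behavior.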
“…In this work, we consider an attack model where the DRL environments generate poisoned observations. Our contributions are two-fold: 1) our proposed method of training agents with triggered behavior is simpler than the algorithms outlined by Kiourti et al. (2020), cast as a multitask learning problem, which is an approach that, to the best of our knowledge, has not been explored by other published methods for this purpose, and 2) it enables further research into triggers which may not be as easily supported by the MITM attack model, such as triggers that may emerge from multiple agents interacting in the environment.…”
Section: Trojans in Reinforcement Learning Agents
Confidence: 99%
“…There are backdoor attacks against other tasks or paradigms, such as Refs. [20][21][22] in the area of natural language processing, Refs. [23,24] in reinforcement learning, Refs.…”
Section: Introduction
Confidence: 99%