Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020
DOI: 10.1145/3394486.3403089
Malicious Attacks against Deep Reinforcement Learning Interpretations

Cited by 23 publications (15 citation statements)
References 5 publications
“…Here, we consider the top-k robustness, which requires that the set of concepts with the k highest importance scores remains invariant under small-norm adversarial perturbations. In practice, the top-k attack (Ghorbani, Abid, and Zou 2019; Slack et al. 2020; Huai et al. 2020b; Sarkar, Sarkar, and Balasubramanian 2020; Stergiou 2021) seeks to perturb the concept importance map by decreasing the relative importance of the k initially most important concepts. Let [D] and S_{x,k} denote the index set of the concepts and the set of concepts that had the top k highest importance scores for sample x, respectively.…”
Section: Methods (mentioning)
Confidence: 99%
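To make the top-k notions concrete, here is a minimal PyTorch sketch (not taken from the cited papers) of a gradient-based top-k attack together with the corresponding robustness check. It assumes a hypothetical differentiable `importance_fn` that maps an input `x` to a 1-D tensor of concept importance scores; the l_inf projection, step count, and optimizer settings are illustrative choices only.

```python
import torch

def topk_set(importance, k):
    # S_{x,k}: indices of the k concepts with the highest importance scores
    return set(torch.topk(importance, k).indices.tolist())

def topk_attack(x, importance_fn, k, eps=0.05, steps=40, lr=1e-2):
    # Gradient-based sketch of the top-k attack: search for a small-norm
    # perturbation that pushes the initially most important concepts down the ranking.
    s_xk = list(topk_set(importance_fn(x), k))        # original top-k concept indices
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        imp = importance_fn(x + delta)
        loss = imp[s_xk].sum()                        # importance mass left on the original top-k
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                   # keep the perturbation inside a small l_inf ball
    x_adv = (x + delta).detach()
    # Top-k robustness holds if the top-k set is unchanged under the found perturbation
    still_topk = topk_set(importance_fn(x_adv), k) == set(s_xk)
    return x_adv, still_topk
```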
“…For the theoretical analysis, two standard victims with adversarial observations, i.e., the tabular certainty-equivalence learner in reinforcement learning and the linear quadratic regulator in control, have been analyzed within a convex optimization framework, for which global optimality, attack feasibility, and attack cost have been established [201]. In addition, the effectiveness of a universal adversarial attack against DRL interpretations (i.e., UADRLI) has been verified by theoretical analysis [204], from which the attacker can add the crafted universal perturbation uniformly to the environment states in a maximum number of steps while incurring minimal damage. In order to stealthily attack DRL agents, the work in [205] injects adversarial samples at a minimal set of critical moments while causing the most severe damage to the agent.…”
Section: Model Inversion (mentioning)
Confidence: 99%
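As an illustration of the attack surface described above, the following is a minimal sketch (assuming the Gymnasium reset/step API and a Box observation space) of rolling out a policy while a fixed, universal perturbation `delta` is added uniformly to the observed states for at most `max_attack_steps` steps. Crafting `delta` itself is the optimization problem solved by UADRLI and is not shown here.

```python
import numpy as np

def rollout_with_universal_perturbation(env, policy, delta, max_attack_steps, max_len=1000):
    # Apply the same crafted perturbation `delta` to the observed state for at most
    # `max_attack_steps` steps of an episode; the agent acts on the perturbed states.
    obs, _ = env.reset()
    total_reward, attacked = 0.0, 0
    for _ in range(max_len):
        if attacked < max_attack_steps:
            obs_seen = np.clip(obs + delta,
                               env.observation_space.low,
                               env.observation_space.high)
            attacked += 1
        else:
            obs_seen = obs
        action = policy(obs_seen)          # the agent (and any interpreter) sees the perturbed state
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward
```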
“…If the adversary attacks the DRL system with the capacity to access the architecture and weight parameters of the policy and Q networks, and to query the network, we call it a white-box attack. Clearly, the attacker can formulate an optimization framework for the white-box setting [142], [204] and derive the optimal adversarial perturbation. Moreover, via theoretical analysis of the attack feasibility and attack cost, the adversary can attack the DRL agent efficiently and stealthily [143], [201].…”
Section: Model Inversion (mentioning)
Confidence: 99%
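For intuition about the white-box setting, the sketch below uses a single FGSM-style gradient step as a stand-in for the optimization frameworks cited above, not as the cited papers' actual formulation. Here `q_net` is an assumed PyTorch module mapping a state tensor to a 1-D vector of Q-values, and `eps` is an illustrative perturbation budget.

```python
import torch

def whitebox_state_perturbation(q_net, state, eps=0.01):
    # White-box sketch: with access to the Q-network's weights, take one signed
    # gradient step on the state so the agent's originally preferred action looks worse.
    state = state.clone().detach().requires_grad_(True)
    q_values = q_net(state)
    preferred = q_values.argmax()          # action the agent would choose on the clean state
    q_values[preferred].backward()         # gradient of that action's value w.r.t. the state
    return (state - eps * state.grad.sign()).detach()
```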
“…In DRL, the policy is a neural network, and its framework contains an agent that interacts with the environment as it operates, learning to make decisions by rewarding desired behavior and punishing undesired behavior. The continual interaction of the DRL agent with the end user exposes it to a number of adversarial attacks [4], [7]. The assumption of a secure environment to interact with is not satisfactory in vehicular applications, where attacker-induced misbehavior can be life-threatening in the case of vehicle accidents.…”
Section: Introduction (mentioning)
Confidence: 99%