2022
DOI: 10.48550/arxiv.2203.05774
Preprint
Reinforcement Learning for Linear Quadratic Control is Vulnerable Under Cost Manipulation

Abstract: In this work, we study the deception of a Linear-Quadratic-Gaussian (LQG) agent by manipulating the cost signals. We show that a small falsification of the cost parameters will only lead to a bounded change in the optimal policy, and the bound is linear in the amount of falsification the attacker can apply to the cost parameters. We propose an attack model where the goal of the attacker is to mislead the agent into learning a 'nefarious' policy with intended falsification of the cost parameters. We formulate th…
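The abstract's claim — a small perturbation of the cost parameters produces a change in the optimal policy that is roughly linear in the perturbation size — can be checked numerically for discrete-time LQR. The following sketch is illustrative only and not the paper's method: the system matrices and the perturbation of Q are made-up examples, and the gain is computed with SciPy's Riccati solver.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    """Optimal feedback gain K for x_{t+1} = A x_t + B u_t with
    stage cost x'Qx + u'Ru, via the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Toy controllable system (hypothetical example, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K = lqr_gain(A, B, Q, R)

# Attacker falsifies the state-cost matrix Q by a small epsilon and
# we observe how far the resulting optimal gain drifts.
for eps in (1e-3, 1e-2, 1e-1):
    K_eps = lqr_gain(A, B, Q + eps * np.eye(2), R)
    print(f"eps={eps:g}  ||K_eps - K|| = {np.linalg.norm(K_eps - K):.2e}")
```

For small eps, the printed deviations shrink roughly in proportion to eps, consistent with the bounded, linear sensitivity the abstract describes.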

Cited by 1 publication (1 citation statement). References 25 publications.
“…Understanding these attacks and developing countermeasures to guarantee the safety of the systems are critical [12]. Previous works have explored policy poisoning attacks occurring via manipulation of the system's cost measurements in a discrete-time control system [13], [14]. Our work focuses on the case of policy poisoning through sensor data manipulation on a batch-learning agent in the continuous-time setting, which we believe to be a more realistic and applicable attack scenario.…”
Section: A. Related Work
confidence: 99%