2019 IEEE 58th Conference on Decision and Control (CDC), 2019
DOI: 10.1109/cdc40024.2019.9030012

Unpredictable Planning Under Partial Observability

Abstract: We study the problem of synthesizing a controller that maximizes the entropy of a partially observable Markov decision process (POMDP) subject to a constraint on the expected total reward. Such a controller minimizes the predictability of a decision-maker's trajectories while guaranteeing the completion of a task expressed by a reward function. First, we prove that a decision-maker with perfect observations can randomize its paths at least as well as a decision-maker with partial observations. Then, focusing o…
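As a rough formalization only (the notation below is generic and not taken from the paper): the abstract's synthesis problem maximizes the entropy of the path distribution induced by the policy, subject to a lower bound on the expected total reward,

$$
\max_{\pi}\; H\!\left(\mathrm{Pr}^{\pi}\right) \;=\; -\sum_{\xi} \mathrm{Pr}^{\pi}(\xi)\,\log \mathrm{Pr}^{\pi}(\xi)
\quad \text{s.t.} \quad \mathbb{E}^{\pi}\!\left[\sum_{t} R(s_t,a_t)\right] \ge \Gamma,
$$

where $\pi$ is an observation-based policy, $\xi$ ranges over state-action paths of the POMDP, $R$ is the task reward, and $\Gamma$ is the required expected-reward threshold.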

Cited by 10 publications (7 citation statements). References 21 publications.

“…Most recently, the problem of obfuscating entire state trajectories from any conceivable estimator has been investigated by drawing on ideas from privacy in static settings (e.g., datasets) including differential privacy [20], [34], [35] and information theory [20], [21], [36], [37]. These works, however, sidestep complete POMDP treatments either by only increasing the state's unpredictability [22], [25] or by only degrading the measurements [21], [36], [37] (rather than a combination of the two). Furthermore, as noted in [38], POMDPs for information-averse or obfuscation problems frequently involve cost and cost-to-go functions that are not concave in the belief state, and so have mostly been avoided until recently because no satisfying (approximate) solution techniques existed.…”
Section: A Related Work
confidence: 99%
“…where c_k, ˜, and g are defined in (5), (25), and (28). Then, the active obfuscation problem (5) is equivalent to:…”
Section: A Belief-State MDP Reformulation
confidence: 99%
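For context on the belief-state MDP reformulation referenced above: the standard construction replaces the hidden state with the Bayes-filter belief, so the problem becomes a fully observed MDP over beliefs. A generic form of the belief update (again, generic notation, not the cited paper's) is

$$
b_{k+1}(s') \;=\; \frac{O(o_{k+1}\mid s',a_k)\sum_{s} T(s'\mid s,a_k)\,b_k(s)}{\sum_{s''} O(o_{k+1}\mid s'',a_k)\sum_{s} T(s''\mid s,a_k)\,b_k(s)},
$$

where $T$ is the transition kernel and $O$ the observation likelihood; costs (including information-theoretic ones) can then be defined directly on the belief $b_k$.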
“…a dynamical system) is particularly challenging since system inputs and/or outputs from isolated time instances have the potential to reveal information about the entire state trajectory through correlations introduced by the system dynamics. The design of both controllers [1]- [4], [9], [14] and output filters [8], [13] that limit the disclosure of dynamical system state information through inputs and/or outputs has therefore attracted considerable recent attention (see also [15], [16] and references therein). Despite these efforts, few works have addressed the problem of how best to control a system to conceal its entire state trajectory from an adversary that employs a Bayesian smoother for state trajectory estimation.…”
Section: Introduction
confidence: 99%
“…Whilst many works on covert motion planning have been developed in the context of robot navigation with deterministic descriptions of adversary sensing capabilities (e.g., sensor field of view) [21]- [23], covert motion planning is increasingly being considered with approaches inspired by information-theoretic privacy. For example, information-theoretic approaches that seek to make the future trajectory or policy of an agent difficult to predict and/or infer have been proposed in [3], [4], [24] on the basis of maximising the control policy entropy, and in [1] on the basis of Fisher information. One of the few approaches that explicitly seeks to conceal an agent's entire trajectory is proposed in [2] and involves minimising the total probability mass (over a specific time horizon) that the state estimates at individual times from a Bayesian filter have at the true states.…”
Section: Introduction
confidence: 99%