2019 IEEE 58th Conference on Decision and Control (CDC) 2019
DOI: 10.1109/cdc40024.2019.9029287

Incentive Design for Temporal Logic Objectives

Abstract: We study the problem of designing an optimal sequence of incentives that a principal should offer to an agent so that the agent's optimal behavior under the incentives realizes the principal's objective, expressed as a temporal logic formula. We consider an agent with a finite decision horizon and model its decision-making process as a Markov decision process (MDP). Under certain assumptions, we present a polynomial-time algorithm to synthesize an incentive sequence that minimizes the cost to the principal. We s…

Cited by 4 publications (7 citation statements)
References 12 publications
“…1) Policy synthesis: The problem in (24) is a total undiscounted cost minimization problem subject to a constraint on the total undiscounted reward given in (20). We recently established the existence of optimal stationary deterministic policies for such problems in [30], where we also present an efficient method to synthesize optimal policies. For completeness, we provide the developed method below.…”
Section: An Approximation Algorithm (mentioning)
confidence: 99%
“…Proposition 4: [30] A policy $\pi^{\star}_{D,\mathrm{app}} \in \Pi^{SD}(M)$ generated by the rule in (27) satisfies the condition in (24).…”
Section: An Approximation Algorithm (mentioning)
confidence: 99%
“…A proof of Proposition 1 can be found in [16]. Intuitively, the LP in (21a)-(21c) computes the minimum expected time to reach the set $B$ with probability $R_{\max}(M, B)$ and cost $\upsilon$.…”
Section: A Behavior Modification of a Dominant Type: Formulation (mentioning)
confidence: 99%
“…Consider an agent with a finite decision horizon $N \in \mathbb{N}$ whose objective is to maximize the expected total reward it collects at the end of every $N$ stages [16]. Such an agent's optimal policy is a sequence ($D_1$, $D_2$, .…”
Section: Appendix A (mentioning)
confidence: 99%
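
The finite-horizon agent described in the statement above can be illustrated with standard backward induction, which yields one decision rule per stage. This is only a minimal sketch under stated assumptions, not the method of the cited papers: the two-state MDP, its action names, rewards, and the function `backward_induction` are hypothetical and chosen purely for illustration.

```python
# Hypothetical toy MDP (not taken from the paper).
# P[s][a] is a list of (next_state, probability); R[s][a] is the stage reward.
P = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.8), (0, 0.2)]},
    1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]},
}
R = {
    0: {"stay": 0.0, "go": -1.0},
    1: {"stay": 2.0, "go": 0.0},
}


def backward_induction(P, R, horizon):
    """Return (rules, values) for an agent maximizing expected total reward
    over `horizon` stages. rules[k][s] is the action chosen in state s at
    stage k (0-indexed); values[s] is the optimal expected total reward
    when starting from state s at the first stage."""
    values = {s: 0.0 for s in P}  # no reward is collected after the last stage
    rules = []
    for _ in range(horizon):  # sweep from the last stage back to the first
        new_values, rule = {}, {}
        for s in P:
            best_action, best_q = None, float("-inf")
            for a in P[s]:
                # Stage reward plus expected value of the remaining stages.
                q = R[s][a] + sum(p * values[t] for t, p in P[s][a])
                if q > best_q:
                    best_action, best_q = a, q
            new_values[s], rule[s] = best_q, best_action
        values = new_values
        rules.append(rule)
    rules.reverse()  # rules were collected last stage first
    return rules, values


rules, values = backward_induction(P, R, horizon=2)
print(rules)   # one decision rule per stage
print(values)  # optimal expected total reward from each initial state
```

In this toy instance the agent pays a cost to move toward the rewarding state at the first stage and then stays put, so the resulting decision rules differ from stage to stage, which is why a finite-horizon policy is a sequence of rules rather than a single stationary one.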