2022
DOI: 10.48550/arxiv.2205.14812
Preprint
TaSIL: Taylor Series Imitation Learning

Abstract: We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control. TaSIL penalizes deviations in the higher-order Taylor series terms between the learned and expert policies. We show that experts satisfying a notion of incremental input-to-state stability are easy to learn, in the sense that a small TaSIL-augmented imitation loss over expert trajectories guarantees a small imitation loss over trajectories generated by the learned…
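For intuition, a TaSIL-style augmented behavior cloning loss might look like the sketch below, truncated at the first-order term. This is a minimal illustration assuming both policies are differentiable JAX functions of the state; the function names, the squared-error form, and the single-Jacobian truncation are my assumptions, not the paper's exact construction, which penalizes higher-order Taylor terms more generally.

```python
import jax
import jax.numpy as jnp

def tasil_loss(learner, expert, states, weight=1.0):
    """Behavior-cloning loss augmented with a first-order Taylor term.

    `learner` and `expert` map a state vector to an action vector;
    `states` is a batch of expert states, shape (n, dim_x). Only the
    zeroth- and first-order terms are shown here.
    """
    def per_state(x):
        # Zeroth-order term: standard behavior-cloning error at x.
        bc = jnp.sum((learner(x) - expert(x)) ** 2)
        # First-order term: mismatch between the policy Jacobians at x.
        jac_gap = jax.jacrev(learner)(x) - jax.jacrev(expert)(x)
        return bc + weight * jnp.sum(jac_gap ** 2)
    return jnp.mean(jax.vmap(per_state)(states))
```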

Cited by 4 publications (4 citation statements) · References 12 publications
“…However, this result does not immediately imply that π̂ and π follow similar trajectories (errors could compound over time, causing π̂'s trajectories to diverge from those of π). Following the main ideas of Tu et al. [46] and Pfrommer et al. [31], one can in fact show guarantees in terms of trajectories when the closed-loop system under π is robust in an appropriate sense. See Appendix B for the details.…”
Section: Infinite Model Classes
confidence: 96%
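The robustness argument alluded to in this excerpt typically takes the following shape (a sketch in my own notation, not the cited appendix): if the expert closed-loop is incrementally input-to-state stable, then a uniform policy error on the expert trajectory controls the divergence of the learner's rollout.

```latex
% x_t: expert trajectory; \hat{x}_t: learner rollout from the same
% initial state; \gamma: a class-K comparison function (my notation).
\[
\sup_{t}\bigl\|\hat{\pi}(x_t)-\pi(x_t)\bigr\|\le\varepsilon
\quad\Longrightarrow\quad
\sup_{t}\bigl\|\hat{x}_t-x_t\bigr\|\le\gamma(\varepsilon).
\]
```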
“…Stability of learning-based MPC was established in [2,72] and followed, for nonlinear systems, by efforts on joint learning of the controller and/or Lyapunov functions [13-15, 21, 22]. In addition, [64,83] have explored how learning-based control affects systems with known Lyapunov functions, [12,23,68] studied learning of stability certificates and stable controllers from data, and [6] developed a provably stable data-driven algorithm based on system measurements and prior system knowledge.…”
Section: Control Design Problems For Hyperbolic PDEs Are Hyperbolic
confidence: 99%
“…Control-theoretic methods have been explored to learn policies with stability guarantees by constraining the policy and system dynamics. Taylor Series IL [27] shows that the induced trajectories of a learner and expert will be close if the difference in their derivatives at expert states is small. However, computing high-order derivatives of the expert policy is difficult without sufficient data.…”
Section: B. Imitation Learning With Robustness
confidence: 99%
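To see why estimating expert derivatives is data-hungry, consider a least-squares finite-difference estimate of the expert Jacobian from demonstration samples near a state. This is a hypothetical illustration of the data requirement the excerpt raises; `estimate_expert_jacobian` and the ridge regularization are my constructions, not from [27].

```python
import jax.numpy as jnp

def estimate_expert_jacobian(x0, u0, states, actions, ridge=1e-6):
    """Least-squares estimate of the expert Jacobian dpi/dx at x0.

    Fits actions - u0 ~ J @ (states - x0) from demonstration samples
    near x0. A well-conditioned fit needs at least dim(x) well-spread
    neighbors around every state of interest, and each extra derivative
    order multiplies that requirement, hence the data difficulty.
    """
    dX = states - x0   # (n, dim_x) state offsets from x0
    dU = actions - u0  # (n, dim_u) action offsets from u0
    # Ridge-regularized normal equations: (dX^T dX + rI) J^T = dX^T dU.
    gram = dX.T @ dX + ridge * jnp.eye(dX.shape[1])
    J_T = jnp.linalg.solve(gram, dX.T @ dU)
    return J_T.T       # (dim_u, dim_x)
```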