2022
DOI: 10.1177/02783649221082115
Backpropagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods

Abstract: This paper presents a technique, named STLCG, to compute the quantitative semantics of Signal Temporal Logic (STL) formulas using computation graphs. STLCG provides a platform which enables the incorporation of logical specifications into robotics problems that benefit from gradient-based solutions. Specifically, STL is a powerful and expressive formal language that can specify spatial and temporal properties of signals generated by both continuous and hybrid systems. The quantitative semantics of STL provide …

Cited by 13 publications (20 citation statements) · References 26 publications
“…We briefly summarize Eq. (1) here, but refer readers to Leung et al. [17] (Section 2.2) for a pedagogical introduction to STL. The core of an STL formula is its predicates µ_c of the form µ(z) > c, where c ∈ R and µ : R^n → R is a differentiable function.…”
Section: Signal Temporal Logic (STL)
confidence: 99%
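The predicate semantics described in this statement can be sketched in a few lines. The robustness of a predicate µ(z) > c is simply µ(z) − c: positive when satisfied, negative when violated, with magnitude measuring the margin. The function names below (`mu`, `pred_robustness`) and the example µ are illustrative choices, not part of STLCG's API:

```python
import math

def mu(z):
    # Example differentiable function mu: R^2 -> R
    # (distance of a 2D point from the origin).
    return math.sqrt(z[0] ** 2 + z[1] ** 2)

def pred_robustness(z, c):
    # Quantitative semantics of the predicate mu(z) > c:
    # positive iff satisfied; the value is the satisfaction margin.
    return mu(z) - c

print(pred_robustness((3.0, 4.0), 2.0))  # prints 3.0 (satisfied, margin 3)
print(pred_robustness((0.0, 1.0), 2.0))  # prints -1.0 (violated)
```

Because µ is differentiable, so is the robustness, which is what lets gradient-based methods push a signal toward satisfying the predicate.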
“…By construction, every STL formula admits a robustness formula measuring the degree of rule satisfaction, which is used as the guide J. Since it is necessary to compute a gradient through this function, STL formulas are implemented using differentiable frameworks [17], [18]. Multi-agent guidance.…”
Section: While Not Done Do
confidence: 99%
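Computing a gradient through a robustness formula requires smoothing its min/max operators, since the hard min used for conjunction and "always" has uninformative gradients. A common trick in differentiable STL frameworks is a log-sum-exp softmin; the sketch below is illustrative, with a hypothetical `softmin` helper rather than any library's actual API:

```python
import math

def softmin(values, temp=10.0):
    # Smooth approximation of min via log-sum-exp; approaches the
    # hard min as temp grows, but keeps nonzero gradients everywhere.
    m = min(values)  # subtract the min for numerical stability
    return m - (1.0 / temp) * math.log(
        sum(math.exp(-temp * (v - m)) for v in values)
    )

# Robustness of "always (signal > 0)" over a finite trace is the
# (soft) minimum of the per-timestep robustness values:
trace = [0.5, 0.3, 0.8]
rho = softmin(trace)  # slightly below the hard min of 0.3
```

The temperature parameter trades off smoothness against fidelity to the true (hard-min) robustness value.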
“…Since the QP (9) is differentiable with respect to its parameters using the technique in [25], we backpropagate the gradient of the objective function in (19) through the QP to all parameters θ. The gradients of the STL robustness are calculated analytically and automatically using an adapted version of STLCG [26] that uses the robustness in [23]. Then we update the parameters using the gradient.…”
Section: Learning Robust Controllers
confidence: 99%
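The parameter-update step this statement describes — ascend the gradient of a robustness objective with respect to controller parameters θ — can be sketched in miniature. The toy objective and the finite-difference gradient below are stand-ins for the real STL robustness and for automatic differentiation; none of the names come from the cited works:

```python
def robustness(theta):
    # Toy objective standing in for the STL robustness of the closed
    # loop: positive (satisfied) when theta is near 1.0.
    return 1.0 - (theta - 1.0) ** 2

def grad(f, x, h=1e-5):
    # Central finite difference as a stand-in for backpropagation.
    return (f(x + h) - f(x - h)) / (2 * h)

theta = 0.0
for _ in range(200):
    theta += 0.1 * grad(robustness, theta)  # gradient ascent on robustness
# theta converges toward 1.0, where the specification is maximally satisfied
```

In the actual approach, the same gradient flows through both the differentiable QP layer and the STL robustness graph, so one update adjusts all parameters jointly.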
“…In this setting, logical specifications serve primarily to help generate the reward function used by a DRL procedure; this approach is known as reward shaping. However, as we show in this paper, by equipping these enriched semantics with differentiable operators [13], [14], policy updates can be meaningfully constrained to yield a significantly more sample-efficient learning technique compared with existing reward-shaping methods.…”
Section: Introduction
confidence: 99%