As autonomous cars become a tangible technology, road networks will soon be shared by human-driven and autonomous cars. However, humans normally act selfishly, which may result in network inefficiencies. In this work, we study how to increase the efficiency of mixed-autonomy traffic networks by routing autonomous cars altruistically. We consider a Stackelberg routing setting in which a central planner routes autonomous cars in favor of society, so that when human-driven cars react and select their routes selfishly, the overall system efficiency increases. We develop a Stackelberg routing strategy for autonomous cars in a mixed-autonomy traffic network with arbitrary geometry. We bound the price of anarchy that our Stackelberg strategy induces and prove that the proposed routing reduces the price of anarchy, i.e., it increases network efficiency. Specifically, we consider a non-atomic routing game in a mixed-autonomy setting with affine latency functions and develop an extension of the SCALE Stackelberg strategy for mixed-autonomy networks. We derive an upper bound on the price of anarchy that this Stackelberg routing induces and demonstrate that, in the limit, our bound recovers the price of anarchy bounds for networks of only human-driven cars.
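To make the price of anarchy and the effect of a SCALE-type leader concrete, the following minimal sketch uses the textbook two-link Pigou network with affine latencies. It assumes a single vehicle type (the human-only limit mentioned above), not the mixed-autonomy model of the paper, and it uses SCALE in its original form, where the leader routes a fraction alpha of the socially optimal flow on each link. All function and variable names are hypothetical.

```python
# Two parallel links with affine latencies l_i(x) = a_i * x + b_i, a_i >= 0.
# Illustrative only: single vehicle type, not the paper's mixed-autonomy model.

def clip(value, low, high):
    return max(low, min(high, value))

def total_cost(x1, x2, link1, link2):
    """Total travel time: flow times latency, summed over both links."""
    (a1, b1), (a2, b2) = link1, link2
    return x1 * (a1 * x1 + b1) + x2 * (a2 * x2 + b2)

def selfish_split(demand, link1, link2, pre1=0.0, pre2=0.0):
    """Flow that selfish users place on link 1 at equilibrium, given
    flows pre1 and pre2 already fixed on the links by a leader."""
    (a1, b1), (a2, b2) = link1, link2
    if a1 + a2 == 0:  # both latencies constant: take the cheaper link
        return demand if b1 <= b2 else 0.0
    # Interior equilibrium equalizes the two latencies; clip to [0, demand].
    y1 = (a2 * (pre2 + demand) + b2 - b1 - a1 * pre1) / (a1 + a2)
    return clip(y1, 0.0, demand)

def optimal_split(demand, link1, link2):
    """Flow on link 1 that minimizes total cost (equalizes marginal costs)."""
    (a1, b1), (a2, b2) = link1, link2
    if a1 + a2 == 0:
        return demand if b1 <= b2 else 0.0
    x1 = (2 * a2 * demand + b2 - b1) / (2 * (a1 + a2))
    return clip(x1, 0.0, demand)

# Pigou network: l_1(x) = x, l_2(x) = 1, unit demand.
LINK1, LINK2, DEMAND = (1.0, 0.0), (0.0, 1.0), 1.0

opt_x1 = optimal_split(DEMAND, LINK1, LINK2)
opt_cost = total_cost(opt_x1, DEMAND - opt_x1, LINK1, LINK2)

# No leader: everyone routes selfishly.
nash_x1 = selfish_split(DEMAND, LINK1, LINK2)
nash_cost = total_cost(nash_x1, DEMAND - nash_x1, LINK1, LINK2)
price_of_anarchy = nash_cost / opt_cost  # 4/3 for this network

# SCALE-type leader: route a fraction alpha of the optimal flow,
# then let the remaining (1 - alpha) of the demand react selfishly.
alpha = 0.5
pre1, pre2 = alpha * opt_x1, alpha * (DEMAND - opt_x1)
followers = (1.0 - alpha) * DEMAND
y1 = selfish_split(followers, LINK1, LINK2, pre1, pre2)
scale_cost = total_cost(pre1 + y1, pre2 + followers - y1, LINK1, LINK2)
scale_ratio = scale_cost / opt_cost

print(price_of_anarchy, scale_ratio)
```

In this toy network the ratio of induced cost to optimal cost drops once half of the demand is routed by the leader, which is the qualitative effect the abstract describes; the bounds for the mixed-autonomy case come from the paper's analysis, not from this sketch.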
This paper proposes a data-driven method for learning convergent control policies from offline data using contraction theory. Contraction theory enables constructing a policy that makes the closed-loop system trajectories inherently convergent towards a unique trajectory. At the technical level, identifying the contraction metric, which is the distance metric with respect to which a robot's trajectories exhibit contraction, is often non-trivial. We propose to jointly learn the control policy and its corresponding contraction metric while enforcing contraction. To achieve this, we learn an implicit dynamics model of the robotic system from an offline data set consisting of the robot's state and input trajectories. Using this learned dynamics model, we propose a data augmentation algorithm for learning contraction policies. We randomly generate samples in the state space and propagate them forward in time through the learned dynamics model to generate auxiliary sample trajectories. We then learn both the control policy and the contraction metric such that the distance between the trajectories from the offline data set and our generated auxiliary sample trajectories decreases over time. We evaluate the performance of our proposed framework on simulated robotic goal-reaching tasks and demonstrate that enforcing contraction results in faster convergence and greater robustness of the learned policy.
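The data augmentation step can be sketched roughly as follows. This is an illustration under strong assumptions, not the paper's implementation: the learned dynamics model is replaced by a hand-written linear double integrator, the policy by a linear state feedback, the contraction metric by a fixed 2x2 matrix, and the "offline" trajectory is itself generated from the stand-in model. The sketch only evaluates a hinge-style contraction loss; in the paper the policy and metric are learned jointly. All names and constants are hypothetical.

```python
import random

# Hypothetical stand-in for the learned dynamics model:
# a discretized double integrator x_{t+1} = A x + B u.
A = [[1.0, 0.1], [0.0, 1.0]]
B = [0.0, 0.1]

def learned_dynamics(x, u):
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]

def policy(x, gain):
    """Linear state feedback u = -gain . x (stand-in for a learned policy)."""
    return -(gain[0] * x[0] + gain[1] * x[1])

def rollout(x0, gain, steps):
    """Propagate a state forward through the model under the policy."""
    traj = [list(x0)]
    for _ in range(steps):
        x = traj[-1]
        traj.append(learned_dynamics(x, policy(x, gain)))
    return traj

def metric_dist_sq(metric, a, b):
    """Squared distance (a - b)^T M (a - b) under a 2x2 metric M."""
    d0, d1 = a[0] - b[0], a[1] - b[1]
    return (metric[0][0] * d0 * d0
            + (metric[0][1] + metric[1][0]) * d0 * d1
            + metric[1][1] * d1 * d1)

def contraction_loss(data_traj, gain, metric, beta,
                     n_aux=20, radius=0.5, seed=0):
    """Average hinge penalty on steps where the metric distance between the
    data trajectory and an auxiliary trajectory fails to shrink by (1 - beta)."""
    rng = random.Random(seed)
    steps = len(data_traj) - 1
    total = 0.0
    for _ in range(n_aux):
        # Random sample near the data trajectory's initial state.
        start = [xi + rng.uniform(-radius, radius) for xi in data_traj[0]]
        aux = rollout(start, gain, steps)
        for t in range(steps):
            now = metric_dist_sq(metric, data_traj[t], aux[t])
            nxt = metric_dist_sq(metric, data_traj[t + 1], aux[t + 1])
            total += max(0.0, nxt - (1.0 - beta) * now)
    return total / (n_aux * steps)

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
HAND_METRIC = [[16.0, 8.0], [8.0, 4.25]]  # chosen by hand for gain [4, 4]

# Zero gain with the Euclidean metric: distances grow, so the loss is positive.
open_traj = rollout([1.0, 0.0], [0.0, 0.0], 10)
loss_open = contraction_loss(open_traj, [0.0, 0.0], IDENTITY, beta=0.1)

# Stabilizing gain with a matching metric: every step contracts, loss is zero.
closed_traj = rollout([1.0, 0.0], [4.0, 4.0], 10)
loss_closed = contraction_loss(closed_traj, [4.0, 4.0], HAND_METRIC, beta=0.1)

print(loss_open, loss_closed)
```

A training loop would minimize such a loss over the parameters of both the policy and the metric, which is the joint learning described above; the fixed gain and metric here only show what a contracting and a non-contracting pair look like.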