Simultaneous policy learning and latent state inference for imitating driver behavior

Laval

Transportation Research Record

et al. 2021

Self-driving technology companies and the research community are accelerating the pace of use of machine learning longitudinal motion planning (mMP) for autonomous vehicles (AVs). This paper reviews the current state of the art in mMP, with an exclusive focus on its impact on traffic congestion. The paper identifies the availability of congestion scenarios in current datasets, and summarizes the required features for training mMP. For learning methods, the major methods in both imitation learning and non-imitation learning are surveyed. The emerging technologies adopted by some leading AV companies, such as Tesla, Waymo, and Comma.ai, are also highlighted. It is found that: (i) the AV industry has been mostly focusing on the long tail problem related to safety and has overlooked the impact on traffic congestion, (ii) the current public self-driving datasets have not included enough congestion scenarios, and mostly lack the necessary input features/output labels to train mMP, and (iii) although the reinforcement learning approach can integrate congestion mitigation into the learning goal, the major mMP method adopted by industry is still behavior cloning, whose capability to learn a congestion-mitigating mMP remains to be seen. Based on the review, the study identifies the research gaps in current mMP development. Some suggestions for congestion mitigation for future mMP studies are proposed: (i) enrich data collection to facilitate the congestion learning, (ii) incorporate non-imitation learning methods to combine traffic efficiency into a safety-oriented technical route, and (iii) integrate domain knowledge from the traditional car-following theory to improve the string stability of mMP.

Section: Memory and Prediction (mentioning)

confidence: 99%

Review of Learning-Based Longitudinal Motion Planning for Autonomous Vehicles: Research Gaps Between Self-Driving and Traffic Congestion

Laval

Transportation Research Record

et al. 2021

“…Some works distinguish between distracted and attentive drivers for behavior prediction and cooperative planning [14], [15]. Driving style recognition has been addressed with both unsupervised and supervised learning methods, which we will discuss in detail below [3], [8], [16], [17].…”

Section: A Driver Internal State Estimation (mentioning)

confidence: 99%

“…Morton et al propose a method that first encodes driving trajectories with different driving styles to a latent space. Then, the latent encodings and the current driver states are fed into a feedforward policy that produces multimodal actions [8]. The encoder and the policy are optimized jointly.…”

Section: A Driver Internal State Estimation (mentioning)

confidence: 99%

“…To address the above problems, Morton et al learns a latent representation of driver traits, which is fed into a feedforward policy to produce multimodal behaviors [8]. However, the feedforward policy only considers current states and actions which are not sufficient to fully express long-term properties of drivers such as traits.…”

Section: Introduction (mentioning)

confidence: 99%


Learning to Navigate Intersections with Unsupervised Driver Trait Inference

Liu

Chang

Chen

et al. 2021

Preprint

Navigation through uncontrolled intersections is one of the key challenges for autonomous vehicles. Identifying the subtle differences in hidden traits of other drivers can bring significant benefits when navigating in such environments. We propose an unsupervised method for inferring driver traits such as driving styles from observed vehicle trajectories. We use a variational autoencoder with recurrent neural networks to learn a latent representation of traits without any ground truth trait labels. Then, we use this trait representation to learn a policy for an autonomous vehicle to navigate through a T-intersection with deep reinforcement learning. Our pipeline enables the autonomous vehicle to adjust its actions when dealing with drivers of different traits to ensure safety and efficiency. Our method demonstrates promising performance and outperforms state-of-the-art baselines in the T-intersection scenario.

IEEE Trans. Intell. Transport. Syst.

“…However, BC has the compounding errors [35] and requires enormous training data [36]-[38]. Even with the recent advances in deep learning techniques, BC approaches [39], [40], trained with large datasets using deep learning, still showed the compounding errors during simulations. On the contrary, IRL tries to recover the reward function followed by the experts, assuming that they follow an optimal policy.…”

Section: Introduction (mentioning)

confidence: 99%

A Generative Adversarial Imitation Learning Approach for Realistic Aircraft Taxi-Speed Modeling

Pham

Tran

Alam

et al. 2022

Classical approaches for modelling aircraft taxi-speed assume constant speed or use a turning rate function to approximate taxi-timings for taxiing aircraft. However, those approaches cannot predict the spatio-temporal component of an aircraft taxi trajectory due to a lack of consideration of the complexity and stochasticity of airport-airside movements and interactions. This research adopts the Generative Adversarial Imitation Learning (GAIL) algorithm for aircraft taxi-speed modelling, while considering multiple operational factors including surrounding traffic on the ground and target take-off time. The proposed model can learn and reproduce the ground movement patterns in a real-world dataset under different circumstances. In addition, the characteristics of the taxi-speed model are also analyzed, especially focusing on handling conflict scenarios with surrounding traffic. Finally, the travel time of the aircraft from starting to target positions is compared with baseline models and actual taxiing data. The proposed model outperforms all the baseline models by a significant margin. In terms of spatial completion (SC), it achieves up to 97.1% for arrivals and 88.3% for departures. The results also show significantly high performance for temporal completion. The model achieves a stable performance with low Root Mean Square Error (RMSE) (16.8 seconds for arrivals, 32.4 seconds for departures) and Mean Absolute Percentage Error (MAPE) (4.4% for arrivals and 7.6% for departures). Our model's errors are 72% lower for arrivals and 48% lower for departures when compared to other baseline models.
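
For readers unfamiliar with the two error metrics quoted in the abstract above, the following is a minimal sketch of how RMSE and MAPE are conventionally computed. It is not the authors' evaluation code: the function names and the sample taxi times are hypothetical and chosen only for illustration, and the abstract does not say exactly which quantities the metrics were computed over.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the same units as the inputs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical taxi times in seconds (illustrative values, not data from the paper).
actual_s = [300.0, 420.0, 510.0]
predicted_s = [310.0, 400.0, 540.0]

print(f"RMSE: {rmse(actual_s, predicted_s):.1f} s")
print(f"MAPE: {mape(actual_s, predicted_s):.1f} %")
```

RMSE is reported in the units of the target (seconds here) and penalizes large errors more heavily, whereas MAPE is scale-free, which is one common reason for reporting both.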