Deep reinforcement‐learning‐based driving policy for autonomous road vehicles

Laval

Transportation Research Record

et al. 2021

Self-driving technology companies and the research community are accelerating their adoption of machine-learning longitudinal motion planning (mMP) for autonomous vehicles (AVs). This paper reviews the current state of the art in mMP, with an exclusive focus on its impact on traffic congestion. The paper identifies the availability of congestion scenarios in current datasets, and summarizes the required features for training mMP. For learning methods, the major methods in both imitation learning and non-imitation learning are surveyed. The emerging technologies adopted by some leading AV companies, such as Tesla, Waymo, and Comma.ai, are also highlighted. It is found that: (i) the AV industry has mostly focused on the long-tail problem related to safety and has overlooked the impact on traffic congestion, (ii) the current public self-driving datasets do not include enough congestion scenarios, and mostly lack the necessary input features/output labels to train mMP, and (iii) although the reinforcement learning approach can integrate congestion mitigation into the learning goal, the major mMP method adopted by industry is still behavior cloning, whose capability to learn a congestion-mitigating mMP remains to be seen. Based on the review, the study identifies the research gaps in current mMP development. Some suggestions for congestion mitigation in future mMP studies are proposed: (i) enrich data collection to facilitate congestion learning, (ii) incorporate non-imitation learning methods to integrate traffic efficiency into a safety-oriented technical route, and (iii) integrate domain knowledge from traditional car-following theory to improve the string stability of mMP.

Section: Limitations In Learning Algorithms (mentioning)

confidence: 99%

Review of Learning-Based Longitudinal Motion Planning for Autonomous Vehicles: Research Gaps Between Self-Driving and Traffic Congestion

Laval

Transportation Research Record

et al. 2021

2021 IEEE International Conference on Robotics and Automation (ICRA)

“…The algorithm in [12] adopted Recurrent Neural Networks for information integration, and learned an effective driving policy on simulators. The authors in [13] proposed a driving policy that makes little assumption about the environment. The work in [14] developed a realistic translation network to make sim2real possible.…”

Section: Reinforcement Learning (mentioning)

confidence: 99%

A Safe Hierarchical Planning Framework for Complex Driving Scenarios based on Reinforcement Learning

Sun

Chen

et al. 2021

Autonomous vehicles need to handle various traffic conditions and make safe and efficient decisions and maneuvers. However, on the one hand, a single optimization/sampling-based motion planner cannot efficiently generate safe trajectories in real time, particularly when there are many interactive vehicles nearby. On the other hand, end-to-end learning methods cannot assure the safety of the outcomes. To address this challenge, we propose a hierarchical behavior planning framework with a set of low-level safe controllers and a high-level reinforcement learning algorithm (H-CtRL) as a coordinator for the low-level controllers. Safety is guaranteed by the low-level optimization/sampling-based controllers, while the high-level reinforcement learning algorithm makes H-CtRL an adaptive and efficient behavior planner. To train and test our proposed algorithm, we built a simulator that can reproduce traffic scenes using real-world datasets. The proposed H-CtRL is shown to be effective in various realistic simulation scenarios, with satisfactory performance in terms of both safety and efficiency.

“…Authors in [120] studied car following and lane changing behaviours of autonomous vehicles using DDDP method on VISSIM. Another RL-based autonomous driving policy is described by Makantasis et al [121] using DDQN with prioritized experience replay in mixed autonomy scenarios. Proposed deep RL-based driving policy is compared with DP-based optimal policy in different traffic densities using SUMO.…”

Section: A Autonomous Driving (mentioning)

confidence: 99%

Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey

Haydari

Yılmaz

2020

Preprint

Recent technological improvements have increased the quality of transportation. New data-driven approaches open a new research direction for all control-based systems, e.g., in transportation, robotics, IoT and power systems. Combining data-driven applications with transportation systems plays a key role in recent transportation applications. In this paper, the latest deep reinforcement learning (RL) based traffic control applications are surveyed. Specifically, traffic signal control (TSC) applications based on (deep) RL, which have been studied extensively in the literature, are discussed in detail. Different problem formulations, RL parameters, and simulation environments for TSC are discussed comprehensively. In the literature, there are also several autonomous driving applications studied with deep RL models. Our survey extensively summarizes existing works in this field by categorizing them with respect to application types, control models and studied algorithms. In the end, we discuss the challenges and open questions regarding deep RL-based transportation applications.