This study proposes a framework for human-like autonomous car-following planning based on deep reinforcement learning (deep RL). Historical driving data are fed into a simulation environment where an RL agent learns through trial-and-error interactions, guided by a reward function that signals how much the agent deviates from the empirical data. Through these interactions, an optimal policy is obtained: a car-following model that maps, in a human-like way, from speed, relative speed between a lead and following vehicle, and inter-vehicle spacing to the acceleration of the following vehicle. The model can be continuously updated when more data are fed in. Two thousand car-following periods extracted from the 2015 Shanghai Naturalistic Driving Study were used to train the model and compare its performance with that of traditional and recent data-driven car-following models. The results show that a deep deterministic policy gradient car-following model that uses the disparity between simulated and observed speed as the reward function and considers a reaction delay of 1 s, denoted as DDPGvRT, can reproduce human-like car-following behavior with higher accuracy than traditional and recent data-driven car-following models. Specifically, the DDPGvRT model has a spacing validation error of 18% and a speed validation error of 5%, both lower than those of the other models, including the intelligent driver model, models based on locally weighted regression, and conventional neural network-based models. Moreover, the DDPGvRT model generalizes well to various driving situations and can adapt to different drivers by continuously learning. This study demonstrates that reinforcement learning methodology can offer insight into driver behavior and can contribute to the development of human-like autonomous driving algorithms and traffic-flow models.
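To make the state, action, and reward described in this abstract concrete, the following is a minimal sketch of one simulation step and a speed-disparity reward. It is not the authors' implementation: the function names, the Euler-style kinematic update, the 0.1 s time step, and the use of a negative absolute speed difference as the reward are all assumptions made for illustration. The abstract only states that the reward is based on the disparity between simulated and observed speed.

```python
def step(state, accel, v_lead_next, dt=0.1):
    """Advance the follower by one time step (hypothetical kinematic update).

    state: (speed, relative_speed, gap), with relative_speed = v_lead - v_follow.
    accel: acceleration chosen by the policy (the RL action), in m/s^2.
    v_lead_next: observed lead-vehicle speed at the next step, in m/s.
    """
    v, dv, gap = state
    v_next = max(v + accel * dt, 0.0)          # follower speed cannot go negative
    dv_next = v_lead_next - v_next             # new relative speed
    gap_next = gap + 0.5 * (dv + dv_next) * dt # trapezoidal update of spacing
    return (v_next, dv_next, gap_next)


def speed_reward(v_sim, v_obs):
    """Assumed reward: the closer simulated speed is to observed speed, the higher."""
    return -abs(v_sim - v_obs)


# Usage: follower at 10 m/s, lead at 10 m/s, 20 m gap, policy outputs +1 m/s^2.
next_state = step((10.0, 0.0, 20.0), accel=1.0, v_lead_next=10.0)
reward = speed_reward(next_state[0], v_obs=10.0)
```

In a full training loop, a DDPG actor would supply `accel` from the current state, and the observed speed would come from the recorded human trajectory for the same time step.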
Although car-following behavior is the core component of microscopic traffic simulation, intelligent transportation systems, and advanced driver assistance systems, the adequacy of existing car-following models for Chinese drivers has not yet been investigated with real-world data. To address this gap, five representative car-following models were calibrated and evaluated for Shanghai drivers, using 2,100 urban-expressway car-following periods extracted from the 161,055 km of driving data collected in the Shanghai Naturalistic Driving Study (SH-NDS). The models were calibrated for each of the 42 subject drivers, and their capabilities of predicting the drivers' car-following behavior were evaluated. The results show that the intelligent driver model (IDM) transfers well to traffic situations not present in the calibration data, and it performs best among the evaluated models. Compared to the Wiedemann 99 model used by VISSIM®, the IDM is easier to calibrate and demonstrates better and more stable performance. These advantages justify its suitability for microscopic traffic simulation tools in Shanghai and likely in other regions of China. Additionally, considerable behavioral differences among drivers were found, which demonstrates a need for archetypes of a variety of drivers to build a traffic mix in simulation. By comparing calibrated and observed values of the IDM parameters, this study found that 1) interpretable calibrated model parameters are linked with the corresponding observable parameters in the real world, but they are not necessarily numerically equivalent; and 2) parameters that can be measured in reality also need to be calibrated if better trajectory-reproducing capability is to be achieved.
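For readers unfamiliar with the IDM evaluated in this abstract, the standard form of its acceleration equation can be sketched as below. The parameter names follow the commonly published IDM formulation; the default values are illustrative placeholders only and are not the values calibrated for Shanghai drivers in the study.

```python
import math


def idm_acceleration(v, v_lead, gap,
                     v0=33.3,    # desired speed, m/s (illustrative)
                     T=1.5,      # desired time headway, s (illustrative)
                     s0=2.0,     # minimum standstill gap, m (illustrative)
                     a_max=1.0,  # maximum acceleration, m/s^2 (illustrative)
                     b=2.0,      # comfortable deceleration, m/s^2 (illustrative)
                     delta=4.0): # acceleration exponent
    """Intelligent driver model: follower acceleration in m/s^2.

    v: follower speed, v_lead: leader speed, gap: bumper-to-bumper spacing (> 0).
    """
    approach_rate = v - v_lead  # positive when closing in on the leader
    # Desired dynamic gap; the max(0, ...) guard is a common variant that
    # keeps the desired gap from dropping below s0 when the leader pulls away.
    s_star = s0 + max(0.0, v * T + v * approach_rate / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


# Usage: a stopped follower with a very distant leader accelerates at about a_max,
# while a follower sitting exactly at its desired gap decelerates slightly.
free_road = idm_acceleration(v=0.0, v_lead=0.0, gap=1e9)
at_desired_gap = idm_acceleration(v=20.0, v_lead=20.0, gap=32.0)
```

Calibration, as described in the abstract, amounts to choosing these parameters per driver so that simulated trajectories match the observed ones; the finding that measurable parameters still need calibration applies to quantities such as `T` and `s0`.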
Rear-end collisions have been estimated to account for 20 to 30 percent of all crashes and about 10 percent of all fatal crashes. A thorough investigation of drivers' collision avoidance behaviors when exposed to rear-end collision risks is needed to help guide the development of effective countermeasures. The urgency or criticality of the situation affects drivers' collision avoidance behavior but has not been systematically investigated. A high-fidelity driving simulator was used to examine the effects of differing levels of situational urgency on drivers' collision avoidance behaviors. Drivers' braking and steering decisions, perception response times, throttle release response times, throttle-to-brake transition times, brake delays, maximum brake pedal pressures, and peak decelerations were recorded under lead vehicle decelerations of 0.3 g, 0.5 g, and 0.75 g and under headways of 1.5 s and 2.5 s. Results showed that 1) as situational urgency increased, drivers released the accelerator and braked to maximum more quickly; 2) the transition time between initial throttle release and brake initiation was not affected by situational urgency; and 3) at low situational urgency, multi-stage braking behavior led to longer delays from brake initiation to full braking. These findings show that the effects of situational urgency on drivers' response times, braking delays, and braking intensity should be considered when developing forward collision warning systems.
Ride comfort plays an important role in determining the public acceptance of autonomous vehicles (AVs). Many factors, such as road profile, driving speed, and suspension system, influence the ride comfort of AVs. This study proposes a hierarchical framework for improving ride comfort by integrating speed planning and suspension control in a vehicle-to-everything environment. Building on safe, comfortable, and efficient speed planning via dynamic programming, a deep reinforcement learning-based suspension control is proposed to adapt to changing pavement conditions. Specifically, a deep deterministic policy gradient with external knowledge (EK-DDPG) algorithm is designed for the efficient self-adaptation of suspension control strategies. External knowledge of action selection and value estimation from other AVs is combined into the loss functions of the DDPG algorithm. In numerical experiments, real-world pavements detected in 11 districts of Shanghai, China, are used to verify the proposed method. Experimental results demonstrate that the EK-DDPG-based suspension control improves ride comfort on untrained rough pavements by 27.95% and 3.32% compared to a model predictive control (MPC) baseline and a DDPG baseline, respectively. Meanwhile, the EK-DDPG-based suspension control improves computational efficiency by 22.97% compared to the MPC baseline and performs at the same level as the DDPG baseline. This study provides a generalized and computationally efficient approach for improving the ride comfort of AVs.