2023
DOI: 10.1007/978-3-031-30333-3_41
Investigating High-Level Decision Making for Automated Driving

Cited by 2 publications
(1 citation statement)
References 7 publications
“…Motivated by the shortcomings of other popular policy gradient algorithms, such as Trust Region Policy Optimization (TRPO) [41] and A2C [42], which suffer from training stability issues and slow policy convergence, PPO introduces smaller policy update steps and a clipped objective function to ensure stable, generalized and efficient learning. Thanks to these features and its implementation simplicity, PPO has quickly gained popularity in a wide range of DRL-related research fields, including game-playing and robotic control [43]–[45].…”
Section: Reinforcement Learning and Highway-env Development Environment
confidence: 99%
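The clipped objective the statement refers to can be sketched per sample as follows. This is a minimal illustration of the PPO clipping idea, not code from the cited work; the function name and default clip range (0.2, the value commonly used in practice) are assumptions for this example.

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective for a single sample.

    ratio: probability ratio r = pi_new(a|s) / pi_old(a|s)
    advantage: estimated advantage A(s, a)
    eps: clip range epsilon (hypothetical default of 0.2)
    """
    unclipped = ratio * advantage
    # Clip the ratio to [1 - eps, 1 + eps] so a single update
    # cannot move the policy too far from the old one.
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    # Take the minimum: a pessimistic bound that removes the
    # incentive to push the ratio outside the clip range.
    return min(unclipped, clipped)
```

Taking the minimum of the clipped and unclipped terms is what bounds the policy update step, which is the stability mechanism the citing authors contrast with TRPO and A2C.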