“…When it is illegal, the buffer action will be performed, and the SV legality is ensured by the buffer action. Besides, since the target positions of RL and backup policy meet (12), the backup policy always can find an available timestamp to maneuver before the buffer runs out.…”
Section: A Law-violation Forecaster (mentioning)
confidence: 99%
“…denote the ego vehicle starting tangent and desired ending tangent, respectively. [12] The long-term action is a combination of p(t) and environment transition acquired from the prediction module.…”
Self-driving vehicles have their own intelligence to drive on open roads. However, vehicle managers, e.g., governments or industrial companies, still need a way to tell these self-driving vehicles which behaviors are encouraged or forbidden. Unlike human drivers, current self-driving vehicles cannot understand traffic laws themselves and thus rely on programmers manually writing the corresponding principles into the driving systems. This is inefficient and makes it hard to adapt to temporary traffic laws, especially when the vehicles use data-driven decision-making algorithms. Besides, current self-driving vehicle systems rarely take traffic-law modification into consideration. This work aims to design a road-traffic-law-adaptive decision-making method. The decision-making algorithm is based on reinforcement learning (RL), in which traffic rules are usually implicitly encoded in deep neural networks. The main idea is to give self-driving vehicles adaptability to traffic laws through a law-adaptive backup policy. In this work, natural-language traffic laws are first translated into logical expressions using Linear Temporal Logic (LTL). Then the system monitors in advance, over a long-term RL action space, whether the self-driving vehicle may break the traffic laws. Finally, a sample-based planning method re-plans the trajectory when the vehicle may break the traffic rules. The method is validated in a Beijing Winter Olympic Lane scenario and an overtaking case built in the CARLA simulator. The results show that, with this method, self-driving vehicles can comply with newly issued or updated traffic laws effectively. This method helps self-driving vehicles be governed by digital traffic laws, which is necessary for the wide adoption of autonomous driving.
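The abstract describes translating a traffic law into a Linear Temporal Logic formula and forecasting violations on a predicted trajectory before they occur. A minimal sketch of that monitoring step, assuming an illustrative Olympic-lane rule and hypothetical state fields (`lane`, `hour`) that are not taken from the paper:

```python
from typing import Callable, Dict, Iterable

State = Dict[str, object]  # hypothetical state: lane label, time of day, ...

def globally(prop: Callable[[State], bool], trace: Iterable[State]) -> bool:
    """Evaluate the LTL formula G(prop) on a finite predicted trace:
    the property must hold at every forecast step."""
    return all(prop(s) for s in trace)

def olympic_lane_rule(s: State) -> bool:
    """Illustrative rule: do not occupy the Olympic lane during
    restricted hours (the exact rule text is not from the paper)."""
    restricted = 7.0 <= s["hour"] <= 20.0
    return not (restricted and s["lane"] == "olympic")

# Predicted states from the prediction module (synthetic example).
predicted = [
    {"lane": "normal", "hour": 9.0},
    {"lane": "olympic", "hour": 9.1},  # forecast violation at this step
]

if not globally(olympic_lane_rule, predicted):
    print("law violation forecast -> trigger backup policy re-plan")
```

If the forecaster flags a violation anywhere on the long-term action horizon, the sample-based planner would re-plan before the violating step is reached; a rule update then only changes the logical formula, not the learned policy.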
“…Combining two optimization methods also has been studied [39]. Recently, with the development of deep learning, studies on the path planning using the RL have mainly been proposed [3], [6], [7], [9], [10], [11], [14], [15], [16], [17], [40], [41], [42]. They have supposed the specific scenario and set an environment to apply the agent in the path planning.…”
Section: Path Planning (mentioning)
confidence: 99%
“…It has been widely used in various fields such as robotics [1], [2], [3], drone [4], [5], [6], [7], [8], [9], military service [10], [11], and self-driving car [12], [13]. Recently, reinforcement learning (RL) has been mainly studied for the path planning [3], [7], [9], [10], [11], [14], [15], [16], [17]. To get an optimal solution, it is essential to give enough reward for an agent to reach the goal and to set up a specific environment.…”
The aim of path planning is to reach a goal from a starting point by searching for an agent's route. In path planning, the routes may vary depending on many variables, so it is important for the agent to be able to reach various goals. Numerous studies, however, have dealt with a single goal predefined by the user. In the present study, I propose a novel reinforcement learning framework for a fully controllable agent in path planning. To do this, I propose bi-directional memory editing to obtain various bi-directional trajectories of the agent, in which the agent's behavior and sub-goals are trained with goal-conditioned RL. To let the agent move in various directions, I use a dedicated sub-goal network, separated from the policy network. Lastly, I present reward shaping to reduce the number of steps the agent needs to reach the goal. In the experimental results, the agent was able to reach various goals that it had never visited during training. We confirmed that the agent could perform difficult missions, such as a round trip, and that it used shorter routes with reward shaping.
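The two training ideas in this abstract, reusing trajectories in both directions and reward shaping, can be sketched on a small grid world. All names here (`make_samples`, `shaped_reward`) are illustrative stand-ins, not the paper's implementation; the shaping term uses the standard potential-based form so it only shortens routes without changing the optimal policy:

```python
def make_samples(trajectory):
    """Bi-directional memory editing (sketch): from one trajectory
    s_0..s_T, build goal-conditioned training pairs for the forward
    direction (goal = s_T) and the reversed direction (goal = s_0)."""
    forward = [(s, trajectory[-1]) for s in trajectory[:-1]]
    backward = [(s, trajectory[0]) for s in reversed(trajectory[1:])]
    return forward + backward

def shaped_reward(s, s_next, goal, gamma=0.99):
    """Potential-based reward shaping with the negative Manhattan
    distance to the goal as the potential: positive when a step
    moves the agent closer to the goal, negative when it moves away."""
    def phi(x):
        return -(abs(x[0] - goal[0]) + abs(x[1] - goal[1]))
    return gamma * phi(s_next) - phi(s)

path = [(0, 0), (0, 1), (1, 1), (1, 2)]
samples = make_samples(path)  # 3 forward + 3 backward (state, goal) pairs
step_gain = shaped_reward((0, 0), (0, 1), goal=(1, 2))  # > 0: moving closer
```

One collected path thus yields training signal for both the outbound and the return direction, which is what makes missions like a round trip reachable from a single-direction dataset.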
“…However, the real-world scenarios are usually "long-tail" distributed [5], leading to low model performance in the data-sparse cases [6] [7]. It is because the model "lacks knowledge" about the environment due to insufficient data, also described as high model uncertainty [8] [9]. As a result, the downstream trajectory planner may make risky decisions in "long-tail" cases.…”
A typical trajectory planner for autonomous driving commonly relies on predicting the future behavior of surrounding obstacles. Recently, deep learning has been widely adopted for prediction models because of its impressive performance. However, such models may fail in "long-tail" driving cases where training data is sparse or unavailable, leading to planner failures. To this end, this work proposes a trajectory planner that accounts for the prediction-model uncertainty arising from insufficient data, for safer performance. First, an ensemble network structure estimates the prediction model's uncertainty due to insufficient training data. Then a trajectory planner is designed to consider the worst case arising from prediction uncertainty. The results show that the proposed method can improve the safety of trajectory planning under the prediction uncertainty caused by insufficient data. At the same time, with sufficient data, the framework does not produce overly conservative results. This technology helps improve the safety and reliability of autonomous vehicles under the long-tail data distribution of the real world.
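The two stages this abstract describes, ensemble disagreement as an uncertainty signal and worst-case candidate scoring, can be sketched as follows. The arrays below are synthetic stand-ins for the learned prediction ensemble and the candidate trajectories; none of the names come from the paper:

```python
import numpy as np

def ensemble_uncertainty(ensemble_predictions):
    """Disagreement across K ensemble forecasts of shape (K, T, 2):
    higher variance suggests a data-sparse, "long-tail" situation."""
    return float(np.var(ensemble_predictions, axis=0).mean())

def worst_case_cost(candidate, ensemble_predictions, safe_dist=2.0):
    """Score an ego candidate (T, 2) by its cost under the most
    pessimistic ensemble member (smallest clearance to the obstacle)."""
    costs = []
    for pred in ensemble_predictions:
        clearance = np.linalg.norm(candidate - pred, axis=1).min()
        costs.append(max(0.0, safe_dist - clearance))  # penalize closeness
    return max(costs)  # plan against the worst ensemble member

def pick_plan(candidates, ensemble_predictions):
    """Choose the candidate that is safest in the worst case."""
    return min(candidates,
               key=lambda c: worst_case_cost(c, ensemble_predictions))

ego_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # hugs the obstacle
ego_b = np.array([[0.0, 2.0], [1.0, 2.0], [2.0, 2.0]])  # keeps distance
ensemble = np.array([
    [[0.0, 0.5], [1.0, 0.5], [2.0, 0.5]],  # member 1 obstacle forecast
    [[0.0, 0.6], [1.0, 0.6], [2.0, 0.6]],  # member 2 obstacle forecast
])
best = pick_plan([ego_a, ego_b], ensemble)  # selects ego_b
```

When the ensemble members agree (low `ensemble_uncertainty`), the worst-case member is close to the mean forecast, so the planner is not overly conservative under sufficient data, matching the claim in the abstract.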