“…When it is illegal, the buffer action will be performed, and the SV legality is ensured by the buffer action. Besides, since the target positions of RL and backup policy meet (12), the backup policy always can find an available timestamp to maneuver before the buffer runs out.…”
Section: A Law-violation Forecaster (mentioning)
confidence: 99%
“…denote the ego vehicle starting tangent and desired ending tangent, respectively. [12] The long-term action is a combination of p(t) and environment transition acquired from the prediction module.…”
Self-driving vehicles have their own intelligence to drive on open roads. However, vehicle managers, e.g., governments or industrial companies, still need a way to tell these self-driving vehicles which behaviors are encouraged or forbidden. Unlike human drivers, current self-driving vehicles cannot understand traffic laws themselves and thus rely on programmers manually writing the corresponding principles into the driving systems. This is inefficient and makes it hard to adapt to temporary traffic laws, especially when the vehicles use data-driven decision-making algorithms. Besides, current self-driving vehicle systems rarely take traffic-law modification into consideration. This work aims to design a road-traffic-law-adaptive decision-making method. The decision-making algorithm is based on reinforcement learning (RL), in which traffic rules are usually implicitly encoded in deep neural networks. The main idea is to give self-driving vehicles adaptability to traffic laws through a law-adaptive backup policy. In this work, natural-language traffic laws are first translated into logical expressions using Linear Temporal Logic (LTL). Then the system monitors in advance, over a long-term RL action space, whether the self-driving vehicle may break the traffic laws. Finally, a sample-based planning method re-plans the trajectory when the vehicle may break the traffic rules. The method is validated in a Beijing Winter Olympic Lane scenario and an overtaking case built in the CARLA simulator. The results show that, with this method, self-driving vehicles can comply with newly issued or updated traffic laws effectively. This method helps self-driving vehicles be governed by digital traffic laws, which is necessary for the wide adoption of autonomous driving.
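The abstract describes translating a traffic law into a Linear Temporal Logic formula and forecasting violations on a predicted trajectory before they occur. A minimal sketch of that monitoring step, assuming an illustrative Olympic-lane rule and hypothetical state fields (`lane`, `hour`) that are not taken from the paper:

```python
from typing import Callable, Dict, Iterable

State = Dict[str, object]  # hypothetical state: lane label, time of day, ...

def globally(prop: Callable[[State], bool], trace: Iterable[State]) -> bool:
    """Evaluate the LTL formula G(prop) on a finite predicted trace:
    the property must hold at every forecast step."""
    return all(prop(s) for s in trace)

def olympic_lane_rule(s: State) -> bool:
    """Illustrative rule: do not occupy the Olympic lane during
    restricted hours (the exact rule text is not from the paper)."""
    restricted = 7.0 <= s["hour"] <= 20.0
    return not (restricted and s["lane"] == "olympic")

# Predicted states from the prediction module (synthetic example).
predicted = [
    {"lane": "normal", "hour": 9.0},
    {"lane": "olympic", "hour": 9.1},  # forecast violation at this step
]

if not globally(olympic_lane_rule, predicted):
    print("law violation forecast -> trigger backup policy re-plan")
```

If the forecaster flags a violation anywhere on the long-term action horizon, the sample-based planner would re-plan before the violating step is reached; a rule update then only changes the logical formula, not the learned policy.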
“…Combining two optimization methods also has been studied [39]. Recently, with the development of deep learning, studies on the path planning using the RL have mainly been proposed [3], [6], [7], [9], [10], [11], [14], [15], [16], [17], [40], [41], [42]. They have supposed the specific scenario and set an environment to apply the agent in the path planning.…”
Section: Path Planning (mentioning)
confidence: 99%
“…It has been widely used in various fields such as robotics [1], [2], [3], drone [4], [5], [6], [7], [8], [9], military service [10], [11], and self-driving car [12], [13]. Recently, reinforcement learning (RL) has been mainly studied for the path planning [3], [7], [9], [10], [11], [14], [15], [16], [17]. To get an optimal solution, it is essential to give enough reward for an agent to reach the goal and to set up a specific environment.…”
The aim of path planning is to reach a goal from a starting point by searching for an agent's route. In path planning, the routes may vary depending on many variables, so it is important for the agent to be able to reach various goals. Numerous studies, however, have dealt with a single goal predefined by the user. In the present study, I propose a novel reinforcement learning framework for a fully controllable agent in path planning. To do this, I propose bi-directional memory editing to obtain various bi-directional trajectories of the agent, in which the agent's behavior and sub-goals are trained with goal-conditioned RL. To let the agent move in various directions, I use a dedicated sub-goal network, separated from the policy network. Lastly, I present reward shaping to reduce the number of steps the agent needs to reach the goal. In the experimental results, the agent was able to reach various goals that it had never visited during training. We confirmed that the agent could perform difficult missions, such as a round trip, and that it used shorter routes with reward shaping.
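The two training ideas in this abstract, reusing trajectories in both directions and reward shaping, can be sketched on a small grid world. All names here (`make_samples`, `shaped_reward`) are illustrative stand-ins, not the paper's implementation; the shaping term uses the standard potential-based form so it only shortens routes without changing the optimal policy:

```python
def make_samples(trajectory):
    """Bi-directional memory editing (sketch): from one trajectory
    s_0..s_T, build goal-conditioned training pairs for the forward
    direction (goal = s_T) and the reversed direction (goal = s_0)."""
    forward = [(s, trajectory[-1]) for s in trajectory[:-1]]
    backward = [(s, trajectory[0]) for s in reversed(trajectory[1:])]
    return forward + backward

def shaped_reward(s, s_next, goal, gamma=0.99):
    """Potential-based reward shaping with the negative Manhattan
    distance to the goal as the potential: positive when a step
    moves the agent closer to the goal, negative when it moves away."""
    def phi(x):
        return -(abs(x[0] - goal[0]) + abs(x[1] - goal[1]))
    return gamma * phi(s_next) - phi(s)

path = [(0, 0), (0, 1), (1, 1), (1, 2)]
samples = make_samples(path)  # 3 forward + 3 backward (state, goal) pairs
step_gain = shaped_reward((0, 0), (0, 1), goal=(1, 2))  # > 0: moving closer
```

One collected path thus yields training signal for both the outbound and the return direction, which is what makes missions like a round trip reachable from a single-direction dataset.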
“…However, the real-world scenarios are usually "long-tail" distributed [5], leading to low model performance in the data-sparse cases [6] [7]. It is because the model "lacks knowledge" about the environment due to insufficient data, also described as high model uncertainty [8] [9]. As a result, the downstream trajectory planner may make risky decisions in "long-tail" cases.…”
A typical trajectory planner for autonomous driving commonly relies on predicting the future behavior of surrounding obstacles. Recently, deep learning has been widely adopted for prediction models because of its impressive performance. However, such models may fail in "long-tail" driving cases where training data is sparse or unavailable, leading to planner failures. To this end, this work proposes a trajectory planner that accounts for the prediction-model uncertainty arising from insufficient data, for safer performance. First, an ensemble network structure estimates the prediction model's uncertainty due to insufficient training data. Then a trajectory planner is designed to consider the worst case arising from prediction uncertainty. The results show that the proposed method can improve the safety of trajectory planning under the prediction uncertainty caused by insufficient data. At the same time, with sufficient data, the framework does not produce overly conservative results. This technology helps improve the safety and reliability of autonomous vehicles under the long-tail data distribution of the real world.
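The two stages this abstract describes, ensemble disagreement as an uncertainty signal and worst-case candidate scoring, can be sketched as follows. The arrays below are synthetic stand-ins for the learned prediction ensemble and the candidate trajectories; none of the names come from the paper:

```python
import numpy as np

def ensemble_uncertainty(ensemble_predictions):
    """Disagreement across K ensemble forecasts of shape (K, T, 2):
    higher variance suggests a data-sparse, "long-tail" situation."""
    return float(np.var(ensemble_predictions, axis=0).mean())

def worst_case_cost(candidate, ensemble_predictions, safe_dist=2.0):
    """Score an ego candidate (T, 2) by its cost under the most
    pessimistic ensemble member (smallest clearance to the obstacle)."""
    costs = []
    for pred in ensemble_predictions:
        clearance = np.linalg.norm(candidate - pred, axis=1).min()
        costs.append(max(0.0, safe_dist - clearance))  # penalize closeness
    return max(costs)  # plan against the worst ensemble member

def pick_plan(candidates, ensemble_predictions):
    """Choose the candidate that is safest in the worst case."""
    return min(candidates,
               key=lambda c: worst_case_cost(c, ensemble_predictions))

ego_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # hugs the obstacle
ego_b = np.array([[0.0, 2.0], [1.0, 2.0], [2.0, 2.0]])  # keeps distance
ensemble = np.array([
    [[0.0, 0.5], [1.0, 0.5], [2.0, 0.5]],  # member 1 obstacle forecast
    [[0.0, 0.6], [1.0, 0.6], [2.0, 0.6]],  # member 2 obstacle forecast
])
best = pick_plan([ego_a, ego_b], ensemble)  # selects ego_b
```

When the ensemble members agree (low `ensemble_uncertainty`), the worst-case member is close to the mean forecast, so the planner is not overly conservative under sufficient data, matching the claim in the abstract.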