2020
DOI: 10.1109/lra.2020.2967299

High-Speed Autonomous Drifting With Deep Reinforcement Learning

Abstract: Drifting is a complicated task for autonomous vehicle control. Most traditional methods in this area are based on motion equations derived from an understanding of vehicle dynamics, which are difficult to model precisely. We propose a robust drift controller without explicit motion equations, based on the latest model-free deep reinforcement learning algorithm, soft actor-critic. The drift control problem is formulated as a trajectory following task, where the error-based state and reward are designed…
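As a rough illustration of the error-based formulation sketched in the abstract, the snippet below assembles a state vector from tracking errors against a reference drift trajectory and converts it into a scalar reward. The error terms, dictionary keys, and weights are illustrative assumptions for this sketch, not the paper's exact design.

import numpy as np

def error_state(vehicle, ref):
    """Assemble an error-based state vector for trajectory following.

    `vehicle` and `ref` are assumed to be dicts with position (x, y),
    heading (yaw, rad), speed (m/s) and slip angle (rad); the paper's
    actual state layout may differ.
    """
    e_pos = np.hypot(vehicle["x"] - ref["x"], vehicle["y"] - ref["y"])
    e_yaw = np.arctan2(np.sin(vehicle["yaw"] - ref["yaw"]),
                       np.cos(vehicle["yaw"] - ref["yaw"]))  # wrap heading error to [-pi, pi]
    e_vel = vehicle["speed"] - ref["speed"]
    e_slip = vehicle["slip"] - ref["slip"]
    return np.array([e_pos, e_yaw, e_vel, e_slip], dtype=np.float32)

def error_reward(state, weights=(1.0, 0.5, 0.1, 0.5)):
    """Reward that grows as the tracking errors shrink (illustrative weights)."""
    w = np.asarray(weights, dtype=np.float32)
    return float(np.exp(-np.sum(w * np.abs(state))))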

Cited by 96 publications (43 citation statements). References 18 publications.
“…The proposed framework has been validated in a simulated environment with various velocity settings, which confirms that it performs better than other frameworks. For future work, we will include the action smoothing strategy [19] in our framework so that the robot can generate smoother trajectories at higher angular velocity settings during training.…”
Section: Discussion
confidence: 99%
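The action smoothing strategy cited as [19] above blends the newly sampled action with the previously executed one. A minimal sketch, assuming a fixed smoothing weight k and array-valued actions (the cited work may tune or schedule this weight differently):

import numpy as np

class ActionSmoother:
    """Blend the current policy action with the previous command.

    a_smooth = k * a_new + (1 - k) * a_prev, where k is an assumed constant here.
    """
    def __init__(self, k=0.6, dim=2):
        self.k = k
        self.prev = np.zeros(dim, dtype=np.float32)

    def __call__(self, a_new):
        a_smooth = self.k * np.asarray(a_new, dtype=np.float32) + (1.0 - self.k) * self.prev
        self.prev = a_smooth  # remember the executed command for the next step
        return a_smooth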
“…To solve the second problem, several studies such as [18] and [19] include the DRL agent's output action, i.e., the velocity, as a magnitude term in the reward function so that the agent can generate high velocities for autonomous outdoor vehicles. In the context of DRL-based robot navigation tasks, although the agent is pushed to generate higher velocities as in [20] and [21], only small values are set for the robot's maximum velocities, which prevents it from navigating quickly.…”
Section: Introduction
confidence: 99%
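The statement above describes adding the magnitude of the commanded velocity to the reward so the agent does not settle into slow, overly cautious motion. A hedged sketch of such a shaping term, with illustrative weights and a hypothetical collision flag; the cited works define their own reward terms:

def navigation_reward(linear_vel, dist_to_goal_prev, dist_to_goal, collision,
                      w_progress=1.0, w_speed=0.2, collision_penalty=10.0):
    """Progress-based reward with a bonus proportional to commanded speed."""
    if collision:
        return -collision_penalty
    progress = dist_to_goal_prev - dist_to_goal  # positive when moving toward the goal
    return w_progress * progress + w_speed * abs(linear_vel)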
“…Recently, a data-based approach was proposed for analyzing the stability of discrete-time nonlinear stochastic systems modeled as Markov decision processes, using the classic Lyapunov method from control theory [29]. To address the limited exploration ability caused by a deterministic policy, high-speed autonomous drifting is tackled in [30] with a closed-loop controller based on the deep RL algorithm soft actor-critic (SAC), which controls the steering angle and throttle of simulated vehicles. We should note that deep reinforcement learning algorithms always require time-consuming training episodes.…”
Section: Introduction (A. Related Work)
confidence: 99%
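The controller described in [30] outputs continuous steering and throttle commands. The mapping below is an assumed interface for illustration: a tanh-squashed policy output in [-1, 1]^2 rescaled to hypothetical actuator limits, not the paper's exact action space:

import numpy as np

STEER_MAX = 0.8    # rad, assumed steering limit
THROTTLE_MAX = 1.0  # normalized throttle

def to_vehicle_command(raw_action):
    """Map a squashed policy output in [-1, 1]^2 to (steering, throttle)."""
    a = np.tanh(np.asarray(raw_action, dtype=np.float32))  # keep outputs bounded
    steering = float(a[0]) * STEER_MAX
    throttle = (float(a[1]) + 1.0) / 2.0 * THROTTLE_MAX     # rescale [-1, 1] to [0, 1]
    return steering, throttle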
“…In our proposed method, we modify the prior work of Frans et al. [14] by dividing the training process into two sequential stages: one to obtain the optimal policy for each sub-task and one to acquire the optimal meta policy. Moreover, we also introduce a module that integrates the actions generated by those policies by applying the action smoothing strategy [16], which weights the current action and the previous action so that the robot can generate smooth and safe actions. 2) We show the implementation of our proposed method in the case of person-following robot training.…”
Section: Introduction
confidence: 99%
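The integration module described above combines actions from the sub-task policies under the meta policy and then applies the smoothing weights. The composition below is an assumption for illustration, with hypothetical policy callables, not the authors' implementation:

import numpy as np

def integrate_actions(meta_policy, sub_policies, obs, prev_action, k=0.6):
    """Pick a sub-policy with the meta policy, then smooth its action.

    `meta_policy(obs)` is assumed to return an index into `sub_policies`;
    each sub-policy returns an action array shaped like `prev_action`.
    """
    idx = meta_policy(obs)
    a_new = np.asarray(sub_policies[idx](obs), dtype=np.float32)
    return k * a_new + (1.0 - k) * np.asarray(prev_action, dtype=np.float32)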
“…3) We introduce a novel method called weight-scheduled action smoothing for attending-task training, which does not prevent exploration by the RL agent and enables the robot to generate smoother actions while attending to the target person. Since smoothing the robot's actions may prevent the RL agent from finding the right or left attending goals around the target person, we modify the action smoothing strategy [16] and follow the curriculum learning strategy [17] to schedule the smoothing weights for the current and previous actions during the attending-task training procedure.…”
Section: Introduction
confidence: 99%
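The weight-scheduled variant described here adjusts the smoothing weight over training so that early exploration is not suppressed while later actions become smoother. The linear schedule below is an assumption for illustration; the cited work defines its own curriculum:

def scheduled_smoothing_weight(step, total_steps, k_start=1.0, k_end=0.5):
    """Linearly anneal the weight on the current action from k_start to k_end.

    k = 1.0 means no smoothing (free exploration); smaller k leans more on the
    previous action, yielding smoother behaviour late in training.
    """
    frac = min(max(step / float(total_steps), 0.0), 1.0)
    return k_start + frac * (k_end - k_start)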