2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC)
DOI: 10.1109/ccwc.2019.8666613
Reverse Parking a Car-Like Mobile Robot with Deep Reinforcement Learning and Preview Control

Cited by 9 publications (5 citation statements)
References 9 publications
“…Different setups for DQN-based motion planning [7] have been compared, demonstrating acceptable runtime performance on several devices. The deep deterministic policy gradient (DDPG) [9] with preview control, which relies on a reference signal, has been proposed to solve the optimal control problem of vertical parking. DDPG [8] with manually guided exploration and different control-cycle re-training pipelines has been used to achieve reactive end-to-end parking on a real vehicle platform.…”
Section: Related Work
confidence: 99%
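
The approach this statement refers to folds the preview-control reference into the learning problem: the reference signal previewed over a horizon becomes part of the agent's observation, and DDPG learns a continuous-action policy against it. Below is a minimal sketch of one DDPG update step under that assumption; the state layout, network sizes, and hyperparameters are illustrative guesses, not taken from the paper.

```python
# Hedged sketch of one DDPG update step. The state is assumed to
# concatenate vehicle pose with previewed reference points (8 dims
# total) and the action to be [steering, velocity]; both are guesses.
import copy
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2   # assumed: pose + preview; steer + speed
GAMMA, TAU = 0.99, 0.005       # discount factor, soft-update rate

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One update from a batch: s [B,8], a [B,2], r/done [B,1], s2 [B,8]."""
    # Critic: regress Q(s,a) onto the bootstrapped target from target nets.
    with torch.no_grad():
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=-1))
        target = r + GAMMA * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(
        critic(torch.cat([s, a], dim=-1)), target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: maximize Q(s, mu(s)), i.e. follow the deterministic
    # policy gradient through the critic.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Polyak-average target networks toward the online networks.
    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1.0 - TAU).add_(TAU * p.data)
```

A full implementation would also add a replay buffer and exploration noise on the actor's output, as in the original DDPG algorithm.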
“…For the original policy iteration, all the trajectories are used to evaluate the policy. In the field of model-free RL, the policy gradient method [8, 9] takes the partial derivatives of the expected return with respect to the network parameters such that the return is maximized. Inspired by this, if a new distribution p′(τ_k) that takes the trajectory return into account is applied to τ, the new total expected return might be higher [30]: …”
Section: Data-Efficient RL Algorithm Design
confidence: 99%
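
As context for the quoted passage, the expected return and its policy gradient take the standard score-function form shown below; the precise return-aware reweighting p′(τ_k) is defined in the work the passage cites ([30]) and is not reproduced here.

```latex
% Expected return J(theta) over trajectories tau drawn from the
% policy-induced distribution p_theta, and its score-function gradient.
\[
J(\theta) = \mathbb{E}_{\tau \sim p_\theta(\tau)}\big[R(\tau)\big],
\qquad
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim p_\theta(\tau)}\Big[
      R(\tau) \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)
    \Big].
\]
```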
“…However, it requires all behaviors to be repeatedly sampled and the action-value function to be discrete. Deep Q-Learning (DQN), proposed in [11, 12, 13], introduces high-dimensional perception into reinforcement learning through deep learning, combining convolutional neural networks with reinforcement learning. DQN can address discrete, low-dimensional action spaces.…”
Section: Introduction
confidence: 99%
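
The core idea summarized in this statement — a network mapping a state to one Q-value per discrete action, trained toward a one-step TD target — can be sketched as follows. Dimensions and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the DQN core: epsilon-greedy action selection and
# a single TD update. A full DQN as in [11-13] would add a replay
# buffer and a separate target network; both are omitted here.
import random
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 4, 3, 0.99  # illustrative sizes

# Q-network: maps a state vector to one Q-value per discrete action.
q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def act(state, epsilon):
    """Epsilon-greedy selection over the discrete action set."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def td_update(s, a, r, s_next, done):
    """One gradient step toward the target r + gamma * max_a' Q(s', a')."""
    with torch.no_grad():
        target = r + GAMMA * (1.0 - float(done)) * q_net(s_next).max()
    loss = (q_net(s)[a] - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```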
“…The large number of actions makes it difficult to achieve an effective search, and the discretization process can lead to information loss. Reference [13] proposed deterministic policy gradient (DPG) algorithms, proving that the deterministic policy gradient exists and demonstrating that deterministic gradient algorithms can be more effective than stochastic ones. Deep deterministic policy gradient (DDPG) algorithms build on DPG, using deep function approximation to learn the policy so that it can be applied in high-dimensional, continuous action spaces [14].…”
Section: Introduction
confidence: 99%
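
The existence result mentioned in this statement is the deterministic policy gradient theorem. With deterministic policy μ_θ and its discounted state distribution ρ^μ, the gradient of the expected return is:

```latex
% Deterministic policy gradient theorem (Silver et al., 2014):
% the chain rule through the action chosen by the deterministic policy.
\[
\nabla_\theta J(\theta)
  = \mathbb{E}_{s \sim \rho^{\mu}}\Big[
      \nabla_\theta \mu_\theta(s)\,
      \nabla_a Q^{\mu}(s,a)\big|_{a=\mu_\theta(s)}
    \Big].
\]
```

DDPG [14] implements this gradient with deep networks for both μ_θ and Q, which is what lets it operate in high-dimensional, continuous action spaces.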