Due to the limitation of mobile robots’ understanding of the environment in local path planning tasks, the problems of local deadlock and path redundancy during planning exist in unknown and complex environments. In this paper, a novel algorithm based on the combination of a long short-term memory (LSTM) neural network, fuzzy logic control, and reinforcement learning is proposed, and uses the advantages of each algorithm to overcome the other’s shortcomings. First, a neural network model including LSTM units is designed for local path planning. Second, a low-dimensional input fuzzy logic control (FL) algorithm is used to collect training data, and a network model (LSTM_FT) is pretrained by transferring the learned method to learn the basic ability. Then, reinforcement learning is combined to learn new rules from the environments autonomously to better suit different scenarios. Finally, the fusion algorithm LSTM_FTR is simulated in static and dynamic environments, and compared to FL and LSTM_FT algorithms, respectively. Numerical simulations show that, compared to FL, LSTM_FTR can significantly improve decision-making efficiency, improve the success rate of path planning, and optimize the path length. Compared to the LSTM_FT, LSTM_FTR can improve the success rate and learn new rules.
Aiming at the dimension disaster problem, poor model generalization ability and deadlock problem in special obstacles environment caused by the increase of state information in the local path planning process of mobile robot, this paper proposed a Double BP Q-learning algorithm based on the fusion of Double Q-learning algorithm and BP neural network. In order to solve the dimensional disaster problem, two BP neural network fitting value functions with the same network structure were used to replace the two Q value tables in Double Q-Learning algorithm to solve the problem that the Q value table cannot store excessive state information. By adding the mechanism of priority experience replay and using the parameter transfer to initialize the model parameters in different environments, it could accelerate the convergence rate of the algorithm, improve the learning efficiency and the generalization ability of the model. By designing specific action selection strategy in special environment, the deadlock state could be avoided and the mobile robot could reach the target point. Finally, the designed Double BP Q-learning algorithm was simulated and verified, and the probability of mobile robot reaching the target point in the parameter update process was compared with the Double Q-learning algorithm under the same condition of the planned path length. The results showed that the model trained by the improved Double BP Q-learning algorithm had a higher success rate in finding the optimal or sub-optimal path in the dense discrete environment, besides, it had stronger model generalization ability, fewer redundant sections, and could reach the target point without entering the deadlock zone in the special obstacles environment.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.