2022 China Automation Congress (CAC)
DOI: 10.1109/cac57257.2022.10055364
A Robust Motion Planning Algorithm Based on Reinforcement Learning

Cited by 2 publications (5 citation statements)
References 9 publications
“…In terms of itinerary planning, Zhou Xiao [20] proposed a dynamic-programming-based tourism route planning method that comprehensively considers distance, the number of transfers between public transportation systems, and average road conditions. Xu Ke [21] proposed a reinforcement-learning-based tourism route planning algorithm that avoids areas with frequent traffic accidents and congestion. Huang Zebin [22] constrained tourists' travel time by introducing time window coefficients into the ant colony algorithm's heuristic function.…”
Section: Research on Itinerary Planning Algorithms
confidence: 99%
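The time-window idea quoted above can be illustrated with a short sketch: the ant colony desirability term is scaled by a coefficient that penalizes arrivals outside a spot's visiting window. This is a minimal sketch under assumed definitions; the window bounds, the exponential penalty shape, and all function names are illustrative, not the formulation from the cited paper.

```python
import math

def time_window_coefficient(arrival, open_t, close_t, beta=0.1):
    """Assumed form: 1.0 inside the visiting window, decaying
    exponentially with the size of the violation outside it."""
    if open_t <= arrival <= close_t:
        return 1.0
    violation = (open_t - arrival) if arrival < open_t else (arrival - close_t)
    return math.exp(-beta * violation)

def aco_heuristic(distance, arrival, open_t, close_t):
    """Standard ACO desirability (1/distance) scaled by the
    time-window coefficient."""
    return (1.0 / max(distance, 1e-9)) * time_window_coefficient(
        arrival, open_t, close_t)

# Example: a spot 2 km away whose window is 09:00-11:00
# (times in minutes since midnight); arriving at 10:00 gives full desirability.
print(aco_heuristic(2.0, arrival=10 * 60, open_t=9 * 60, close_t=11 * 60))
```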
“…In each iteration, we start from the user-selected base node combination (line 6), select nodes to add to the current combination, and perform itinerary planning until inserting a scenic spot would exceed the cost limit or the total time limit, or the spot is inaccessible (lines 7-15). The successful itinerary from that round is then recorded as the recommendation for the current iteration, and itinerary-related parameters are calculated and propagated back to the relevant candidate nodes (lines 16-27). Finally, the recommendation with the maximum objective function value among these itinerary recommendations is returned (line 28).…”
Section: Tour Itinerary Recommendation Algorithm Based on Tourist Com…
confidence: 99%
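The loop quoted above can be sketched as a greedy insertion procedure. All names, limits, and stopping tests below are assumptions reconstructed from the quoted description, not the cited paper's code; in particular, the paper propagates itinerary parameters back to candidate nodes between iterations, which this sketch replaces with a random candidate order.

```python
import random

def recommend_itinerary(base_nodes, candidates, step_cost, step_time, reachable,
                        cost_limit, time_limit, objective, n_iterations=10):
    """Greedy insertion sketch of the quoted loop (illustrative names).
    base_nodes is assumed non-empty; step_cost/step_time/reachable take
    (last_spot, next_spot)."""
    recommendations = []
    for _ in range(n_iterations):
        # Stand-in for the paper's parameter feedback between iterations.
        order = random.sample(candidates, len(candidates))
        itinerary = list(base_nodes)      # user-selected base node combination
        total_cost = total_time = 0.0
        for spot in order:                # insert spots until a limit trips
            if spot in itinerary or not reachable(itinerary[-1], spot):
                continue
            c = step_cost(itinerary[-1], spot)
            t = step_time(itinerary[-1], spot)
            if total_cost + c > cost_limit or total_time + t > time_limit:
                break                     # cost or time limit ends this round
            itinerary.append(spot)
            total_cost += c
            total_time += t
        recommendations.append(itinerary)  # record this round's itinerary
    # Return the recommendation with the maximum objective function value.
    return max(recommendations, key=objective)
```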
“…The limitations of related work are summarized below:

Zhou Jibiao, Zhang Haisu, et al. [1,2]: risks in the areas surrounding COVID-19 were not considered.
Ma Chang-xi et al. [3]: other modes of transportation are not considered.
Tu Qiang et al. [4]: only the car network is considered.
Jia Fuqiang et al. [5]: the objective function is relatively simple.
Subramani et al. [6]: the risk-avoidance path has a high probability of error.
Liping Fu et al. [7]: models and algorithms remain to be expanded.
A. Khani et al. [8]: risk factors in travel are not considered.
Xu Ke, Liu Sijia, Luo Fei, et al. [9-11]: the algorithm and model need further improvement.
Wang Keyin et al. [12]: the model is not suitable for urban traffic path planning.
Wang A. et al. [13]: the differing preferences and requirements of passengers are not considered.
Levy S. et al. [14]: the design of the reward function needs further improvement.

Therefore, we use the SUMO simulator to build the actual road network model and design a method to extract the road network impedance matrix, which greatly improves the efficiency and accuracy of road network modeling. We establish a reinforcement learning path planning model in the urban traffic context and design a search mechanism to avoid areas at risk from the COVID-19 epidemic.…”
Section: Author Limitations
confidence: 99%
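The impedance-matrix extraction step mentioned above can be illustrated with sumolib, SUMO's Python library for reading exported road networks. The quoted text does not define its impedance, so the free-flow travel time used below is an assumption, and the function name and file name are illustrative.

```python
import sumolib  # ships with SUMO; reads .net.xml road networks

def impedance_matrix(net_file):
    """Illustrative impedance extraction: free-flow travel time between
    directly connected edges (assumed definition of 'impedance')."""
    net = sumolib.net.readNet(net_file)
    edges = net.getEdges()
    index = {e.getID(): i for i, e in enumerate(edges)}
    n = len(edges)
    INF = float("inf")
    M = [[INF] * n for _ in range(n)]    # INF marks no direct connection
    for e in edges:
        i = index[e.getID()]
        M[i][i] = 0.0
        for nxt in e.getOutgoing():      # edges reachable through a junction
            M[i][index[nxt.getID()]] = e.getLength() / e.getSpeed()
    return M, index

# Usage (assuming a SUMO network file exported beforehand):
# M, index = impedance_matrix("city.net.xml")
```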
“…We used the RRL-APF algorithm to run simulation experiments of up to 300 rounds of learning, where one complete round of learning is defined as the agent traveling from the start point to the end point through exploration. To verify the superiority of the RRL-APF algorithm in convergence speed and other respects, it was compared with the Q-Learning [27] algorithm, the Sarsa [28] algorithm, and the RLAPF [9,12] algorithm under the same start point, end point, and epidemic risk location information.…”
Section: Algorithm Verification
confidence: 99%
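The Q-Learning baseline named in the comparison above is the standard tabular method, sketched below. The environment interface, rewards, and hyperparameters are illustrative assumptions, not the cited experiment's settings.

```python
import random
from collections import defaultdict

def q_learning(env_step, start, actions, episodes=300,
               alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=500):
    """Tabular Q-Learning sketch (illustrative hyperparameters).
    env_step(state, action) -> (next_state, reward, done) is assumed;
    done signals that the goal (end point) was reached."""
    Q = defaultdict(float)
    for _ in range(episodes):             # one episode = start -> end point
        state = start
        for _ in range(max_steps):
            if random.random() < epsilon:  # epsilon-greedy exploration
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            nxt, reward, done = env_step(state, action)
            best_next = max(Q[(nxt, a)] for a in actions)
            # Standard temporal-difference update.
            Q[(state, action)] += alpha * (
                reward + gamma * best_next - Q[(state, action)])
            state = nxt
            if done:
                break
    return Q
```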