2021
DOI: 10.1109/tcds.2019.2928820

BND*-DDQN: Learn to Steer Autonomously Through Deep Reinforcement Learning

Abstract: HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Cited by 21 publications (11 citation statements)
References 44 publications
“…Apart from average accumulated reward, we also compare the success rate of each method in this work. By following previous work [13], an episode is considered to be successful if no collision occurs within 300 steps and the success rate represents the ratio of successful episodes in 50 episodes. In order to assess the generalization capability, we tested all methods in three different kinds of virtual environments directly without any fine-tuning.…”
Section: Results (mentioning)
confidence: 99%
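
The success criterion in the statement above can be sketched in a few lines. This is only an illustration: the 300-step budget and the 50 evaluation episodes come from the quoted passage, while the function names and the representation of an episode as "step of first collision, or `None`" are assumptions made for the example.

```python
from typing import Optional, Sequence

MAX_STEPS = 300     # step budget per episode, from the quoted passage
NUM_EPISODES = 50   # number of evaluation episodes, from the quoted passage


def episode_successful(collision_step: Optional[int],
                       max_steps: int = MAX_STEPS) -> bool:
    """Return True if no collision occurred within `max_steps` steps.

    `collision_step` is the 1-based step of the first collision, or None
    if the agent never collided (a hypothetical encoding for this sketch).
    """
    return collision_step is None or collision_step > max_steps


def success_rate(collision_steps: Sequence[Optional[int]]) -> float:
    """Ratio of successful episodes among all evaluated episodes."""
    if not collision_steps:
        raise ValueError("at least one episode is required")
    successes = sum(episode_successful(step) for step in collision_steps)
    return successes / len(collision_steps)


# Example: 40 collision-free episodes and 10 early collisions.
example_log = [None] * 40 + [10] * 10
example_rate = success_rate(example_log)
```

With `NUM_EPISODES` entries in the log, the returned value corresponds to the success rate as described in the statement.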
“…At the same time, the next state is passed through the online network to determine optimal action and also fed into the target network to compute target Q-value of the determined action by using Equation (10). Based on the Q-values obtained by online network and target network, we define the loss function of DRL as below: (13). In addition, the online network is also used for the proposed auxiliary task by using Equation (12). Therefore, the parameters of the multimodal representation in the online network are updated simultaneously by two-part back-propagation error and two Adam optimizers [32] are used with a same learning rate of 0.0001.…”
Section: Training Framework (mentioning)
confidence: 99%
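
The action-selection and evaluation split described in the statement above matches the general Double DQN pattern: the online network picks the next action and the target network scores it. The sketch below illustrates that pattern with NumPy arrays standing in for network outputs. It is not the citing paper's Equation (10) or (13), which are not reproduced in the excerpt; the discount factor `gamma`, the terminal-state mask, the squared-error loss, and all function names are assumptions.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next,
                       gamma=0.99):
    """Compute Double DQN style targets for a batch of transitions.

    q_online_next, q_target_next: arrays of shape (batch, num_actions)
    holding Q-values of the next state from the online and target
    networks. gamma is an assumed discount factor.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    q_online_next = np.asarray(q_online_next, dtype=np.float64)
    q_target_next = np.asarray(q_target_next, dtype=np.float64)

    # The online network determines the greedy action for the next state.
    best_actions = np.argmax(q_online_next, axis=1)
    # The target network evaluates that action.
    batch_index = np.arange(best_actions.shape[0])
    evaluated = q_target_next[batch_index, best_actions]
    # Terminal transitions (dones == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * evaluated


def squared_error_loss(q_taken, targets):
    """Mean squared error between current Q-values and targets (assumed loss)."""
    q_taken = np.asarray(q_taken, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    return float(np.mean((targets - q_taken) ** 2))


# Tiny batch of two transitions; the second one is terminal.
example_targets = double_dqn_targets(
    rewards=[1.0, 0.0],
    dones=[0.0, 1.0],
    q_online_next=[[1.0, 2.0], [3.0, 0.0]],
    q_target_next=[[10.0, 20.0], [30.0, 40.0]],
    gamma=0.5,
)
example_loss = squared_error_loss([10.0, 1.0], example_targets)
```

In a training loop, the targets would be treated as constants and only the online network's Q-values would receive gradients, for instance through an Adam optimizer with the learning rate of 0.0001 mentioned in the statement.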
“…Recently, deep reinforcement learning (DRL) [16] has been employed to settle navigation problem in an end-to-end manner and achieved considerable successes [17], [18], [19]. Tai et al. [20] trained a DQN agent for obstacle avoidance in a simulated indoor environment.…”
Section: B. Learning-based Methods (mentioning)
confidence: 99%