2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)
DOI: 10.1109/mfi.2017.8170388
Deep reinforcement learning algorithms for steering an underactuated ship

Abstract: Based on state-of-the-art deep reinforcement learning (Deep RL) algorithms, two controllers are proposed to pass a ship through a specified gate. Deep RL is a powerful approach to learning a complex controller that is expected to adapt to different system conditions. This paper explains how to apply these algorithms to the ship steering problem. The simulation results show the advantages of these algorithms in producing reliable and stable controllers.
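The task in the abstract can be pictured as a minimal episodic RL problem. The sketch below is a toy kinematic environment of my own construction, not the underactuated ship model used by the authors: the state is the ship's pose, the action is a rudder-like heading rate, and the episode ends with reward 1 only if the ship crosses the gate within its opening.

```python
import math

# Hypothetical minimal ship-steering environment in the spirit of the paper:
# state = (x, y, heading), action = heading rate, episode ends when the ship
# reaches the gate line. All dynamics and parameters here are illustrative.
class ShipGateEnv:
    def __init__(self, gate_x=50.0, gate_half_width=10.0, speed=1.0, dt=1.0):
        self.gate_x = gate_x
        self.gate_half_width = gate_half_width
        self.speed = speed
        self.dt = dt
        self.reset()

    def reset(self):
        self.x, self.y, self.heading = 0.0, 0.0, 0.0
        return (self.x, self.y, self.heading)

    def step(self, rudder):
        # The rudder action directly changes the heading (toy actuation).
        self.heading += rudder * self.dt
        self.x += self.speed * math.cos(self.heading) * self.dt
        self.y += self.speed * math.sin(self.heading) * self.dt
        done = self.x >= self.gate_x
        # Reward +1 only if the ship crosses the gate within its opening.
        reward = 1.0 if done and abs(self.y) <= self.gate_half_width else 0.0
        return (self.x, self.y, self.heading), reward, done

env = ShipGateEnv()
state, total, done = env.reset(), 0.0, False
while not done:
    state, reward, done = env.step(0.0)  # sail straight ahead
    total += reward
print(total)  # -> 1.0: the straight-line policy passes through the gate center
```

A learned controller would replace the constant `0.0` action with a policy mapping the state to a continuous rudder command.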

Cited by 11 publications (5 citation statements)
References 16 publications
“…An advantage function is constructed in L_model. The action space of the NAF algorithm is a continuous domain [20][21]. The action value is randomly selected by the NAF agent from the range 0 to 50, which is the search interval.…”
Section: Methods
Mentioning confidence: 99%
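The citing paper's "advantage function constructed in L_model" refers to NAF's quadratic advantage term. A minimal sketch, assuming fixed toy values in place of the neural-network outputs: the lower-triangular factor L (the role of the cited L_model) defines a positive semi-definite matrix P, and the advantage A(s, a) = -1/2 (a - mu)^T P (a - mu) is added to the state value V to form Q.

```python
import numpy as np

# Sketch of NAF's quadratic advantage:
#   A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)),   Q(s, a) = V(s) + A(s, a)
# mu, L, and V would normally be network outputs; here they are fixed toys
# for a 2-D continuous action drawn from the search interval [0, 50].
def naf_q_value(action, mu, L, v):
    P = L @ L.T                      # positive semi-definite precision matrix
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage

mu = np.array([25.0, 25.0])          # network's greedy action
L = np.array([[1.0, 0.0],
              [0.5, 1.0]])           # lower-triangular factor (the "L_model")
rng = np.random.default_rng(0)
action = rng.uniform(0.0, 50.0, size=2)   # random action from [0, 50]
print(naf_q_value(mu, mu, L, v=10.0))     # -> 10.0: advantage vanishes at a = mu
print(naf_q_value(action, mu, L, v=10.0) <= 10.0)  # -> True: A(s, a) <= 0
```

Because P is positive semi-definite, the advantage is never positive, so the greedy action mu maximizes Q in closed form; this is what makes NAF a Q-learning method over continuous actions.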
“…Exceptions can be found in the work of [Stamenkovich 1992], where a simple neuron-like actor-critic agent navigated a ship through a channel using two sensor signals; more recently, [Lacki 2008] and [Rak and Gierusz 2012] employed online RL for ship handling in restricted waters, assuming constant speed in a broad area with small obstacles (tabular algorithms with discretized states and function approximation for continuous state spaces were respectively used). Finally, [Tuyen et al 2017] combined actor-critic methods with neural networks to control the rudder with continuous levels, also using constant speed. One common aspect of those efforts is that actions taken by the agent are considered continuous in time, meaning it operates more like a controller and not much in a human style, with the already described limitations.…”
Section: Previous Work
Mentioning confidence: 99%
“…Since then, DRL has been successful in surpassing all previous computer programs in chess and learning how to accomplish complex robotic tasks (Silver et al, 2017; Andrychowicz et al, 2018). Given DRL's ability to tackle problems with high uncertainty, implementations to motion control scenarios involving marine vessels have been presented recently (Shen and Guo, 2016; Zhang et al, 2016; Pham Tuyen et al, 2017; Yu et al, 2017; Cheng and Zhang, 2018; Martinsen and Lekkas, 2018a,b). In most of these works the authors implemented algorithms pertaining to the class of actor-critic RL methods, which involves two parts (Konda and Tsitsiklis, 2000): The actor, where the gradient of the performance is estimated and the policy parameters are directly updated in a direction of improvement.…”
Section: Introduction
Mentioning confidence: 99%
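The actor-critic structure this quote describes can be sketched in a few lines. The example below is a toy two-armed bandit of my own devising, not taken from any of the cited works: the critic tracks action-value estimates, and the actor updates softmax policy parameters along the estimated performance gradient, using the critic's value in place of the sampled return.

```python
import numpy as np

# Minimal one-step actor-critic sketch on a toy two-armed bandit.
# All names, learning rates, and reward probabilities are illustrative.
rng = np.random.default_rng(1)
theta = np.zeros(2)                  # actor: softmax policy parameters
q = np.zeros(2)                      # critic: action-value estimates
true_reward = np.array([0.2, 0.8])   # arm 1 pays off more often

for _ in range(2000):
    probs = np.exp(theta) / np.exp(theta).sum()
    a = rng.choice(2, p=probs)
    r = float(rng.random() < true_reward[a])
    q[a] += 0.1 * (r - q[a])         # critic: running-average value update
    grad = -probs
    grad[a] += 1.0                   # d log pi(a | theta) / d theta
    theta += 0.05 * q[a] * grad      # actor: step along the estimated gradient

print(int(np.argmax(theta)))         # the actor comes to prefer arm 1
```

The separation is the point of the quoted passage: the critic supplies the value estimate, and the actor adjusts the policy parameters directly in a direction of improvement rather than deriving the policy from the values.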