Deep reinforcement learning algorithms for steering an underactuated ship

Tuyen, Le Pham; Layek, Md. Abu; Vien, Ngo Anh; Chung, T. J.

doi:10.1109/mfi.2017.8170388

Cited by 11 publications

(5 citation statements)

References 16 publications


“…An advantage function is construct in L_model. Action space of NAF algorithm is continuous domain [20][21]. The action value is random selected by the NAF agent from the range of 0 to 50 which is the search interval range.…”

Section: Methods (mentioning)

confidence: 99%

Spike neuron optimization using deep reinforcement learning

Hui, Ishak (2021). IJ-AI


Deep reinforcement learning (DRL), which combines reinforcement learning with artificial neural networks, allows agents to take the best possible actions to achieve goals. Spiking neural networks (SNNs) are difficult to train because the spike function of a spike neuron is non-differentiable. To overcome this difficulty, a Deep Q network (DQN) and deep Q learning with a normalized advantage function (NAF) are proposed to interact with a custom environment. DQN is applied to a discrete action space, whereas NAF is implemented for a continuous action space. The model is trained and tested with both algorithms to validate its performance in balancing the firing rates of the excitatory and inhibitory populations of spike neurons. Training results showed that both agents were able to explore the custom environment with the OpenAI Gym framework. The trained models for both algorithms were able to balance the firing rates of the excitatory and inhibitory populations. NAF achieved an average percentage error of 0.80% in the rate difference between the target and actual neuron rates, whereas DQN obtained 0.96%. NAF also attained the goal faster than DQN, taking only 3 steps for the actual output neuron rate to meet or come close to the target neuron firing rate.



“…Exceptions can be found in the work of [Stamenkovich 1992], where a simple neuron-like actor-critic agent navigated a ship through a channel through two sensors signal; more recently, [Lacki 2008] and [Rak and Gierusz 2012] employed online RL for ship handling in restricted waters assuming constant speed in a broad area with small obstacles (tabular algorithms with discretized state and function approximation for continuous states space were respectively used). Finally, [Tuyen et al 2017] combined actor critic methods with neural networks to control rudder with continuous levels also using constant speed. One common aspect of those efforts is that actions taken by the agent are considered continuous in time, meaning it operates more like a controller and not much in a human style with the already described limitations.…”

Section: Previous Work (mentioning)

confidence: 99%

Batch Reinforcement Learning of Feasible Trajectories in a Ship Maneuvering Simulator

Amendola, Tannuri, Cozman, et al. (2018). Anais Do XV Encontro Nacional De Inteligência Artificial E Computacional (ENIAC 2018)


Ship control in port channels is a challenging problem that has resisted automated solutions. In this paper we focus on reinforcement learning of control signals so as to steer ships in their maneuvers. The learning process uses fitted Q iteration together with a Ship Maneuvering Simulator. Domain knowledge is used to develop a compact state-space model; we show how this model and the learning process lead to ship maneuvering under difficult conditions.


“…Since then, DRL has been successful in surpassing all previous computer programs in chess and learning how to accomplish complex robotic tasks (Silver et al, 2017;Andrychowicz et al, 2018). Given DRL's ability to tackle problems with high uncertainty, implementations to motion control scenarios involving marine vessels have been presented recently (Shen and Guo, 2016;Zhang et al, 2016;Pham Tuyen et al, 2017;Yu et al, 2017;Cheng and Zhang, 2018;Martinsen and Lekkas, 2018a,b). In most of these works the authors implemented algorithms pertaining to the class of actor-critic RL methods, which involves two parts (Konda and Tsitsiklis, 2000): The actor, where the gradient of the performance is estimated and the policy parameters are directly updated in a direction of improvement.…”

Section: Introduction (mentioning)

confidence: 99%

Reinforcement Learning-Based Tracking Control of USVs in Varying Operational Conditions

et al. 2020


We present a reinforcement learning-based (RL) control scheme for trajectory tracking of fully-actuated surface vessels. The proposed method learns online both a model-based feedforward controller and an optimizing feedback policy in order to follow a desired trajectory under the influence of environmental forces. The method's efficiency is evaluated via simulations and sea trials, with the unmanned surface vehicle (USV) ReVolt performing three different tracking tasks: the four corner DP test, straight-path tracking, and curved-path tracking. The results demonstrate the method's ability to accomplish the control objectives and good agreement between the performance achieved in the Revolt Digital Twin and in the sea trials. Finally, we include a section with considerations about assurance for RL-based methods and where our approach stands in terms of the main challenges.

