2020
DOI: 10.3390/app10249013

A Parametric Study of a Deep Reinforcement Learning Control System Applied to the Swing-Up Problem of the Cart-Pole

Abstract: In this investigation, the nonlinear swing-up problem of the cart-pole system, modeled as a multibody dynamical system, is solved by developing a deep Reinforcement Learning (RL) controller. Furthermore, a sensitivity analysis of the deep RL controller applied to the cart-pole swing-up problem is carried out. To this end, the influence of modifying the physical properties of the system and of the presence of dry friction forces is analyzed using the cumulative reward obtained during the task. Extreme lim…
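The setting described in the abstract can be illustrated with a minimal sketch: a cart-pole integrator with Coulomb (dry) friction on the cart, and the cumulative reward over a swing-up episode that starts from the downward equilibrium. All parameter values, function names, and the cos(theta) reward below are illustrative assumptions, not taken from the paper.

```python
import math

def step(state, force, dt=0.02, m_cart=1.0, m_pole=0.1, length=0.5,
         g=9.81, mu_c=0.05):
    """One semi-implicit Euler step of the cart-pole with Coulomb (dry)
    friction on the cart. Parameter values are illustrative."""
    x, x_dot, th, th_dot = state
    total_m = m_cart + m_pole
    # Dry friction opposes cart motion (taken as zero when the cart is at rest)
    friction = -mu_c * total_m * g * math.copysign(1.0, x_dot) if x_dot else 0.0
    sin_t, cos_t = math.sin(th), math.cos(th)
    temp = (force + friction + m_pole * length * th_dot**2 * sin_t) / total_m
    th_acc = (g * sin_t - cos_t * temp) / (
        length * (4.0 / 3.0 - m_pole * cos_t**2 / total_m))
    x_acc = temp - m_pole * length * th_acc * cos_t / total_m
    x_dot += dt * x_acc
    th_dot += dt * th_acc
    return (x + dt * x_dot, x_dot, th + dt * th_dot, th_dot)

def episode_return(policy, steps=500):
    """Cumulative reward for one swing-up episode starting from the
    downward equilibrium (theta = pi). Reward = cos(theta): +1 when the
    pole is upright, -1 when it hangs down."""
    state = (0.0, 0.0, math.pi, 0.0)
    total = 0.0
    for _ in range(steps):
        state = step(state, policy(state))
        total += math.cos(state[2])
    return total
```

A do-nothing policy (`lambda s: 0.0`) leaves the pole hanging and collects a strongly negative return, which is the kind of cumulative-reward signal the sensitivity analysis compares across physical-parameter and friction variations.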

Cited by 47 publications (7 citation statements)
References 67 publications
“…9(a). 31,32) The weight coefficient was fixed at +1 and the bit width of data was fixed at 8. Since this was a full-digital implementation, the entire neural net was written in HDL, and the layout could be created only by synthesis and placement and wiring operations.…”
Section: Experimental Setup and Results
confidence: 99%
“…RL thereby achieves long-term results, which are otherwise very difficult to achieve. Deep RL has recently been used in robotic manipulation controllers [13,14]. A deep learning controller based on RL is also implemented in [15] for the application of DL in industrial process control.…”
Section: Output Layer
confidence: 99%
“…Many numerical studies have implemented an inverted pendulum virtual environment as a benchmark to test RL algorithms [15–22], but to our knowledge, there is no study that provides successful RL implementations in experiments. First, except for a few studies that have discussed non-ideal systems [16, 17], most of these numerical implementations discard the effects associated with realistic (and thus more complex) control methods: in experiments, the control of the cart is subject to delay, hysteresis, biases, and noise that can significantly alter the learning process. Second, most of the existing virtual environments consider only motion of the pendulum in a small angle range around the upward, unstable position and do not treat the whole control from the downward, stable position as expected in experiments.…”
Section: Introduction
confidence: 99%
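The non-idealities listed in the excerpt above (actuation delay and noise) can be sketched as a thin wrapper around any simulated plant. The function name, parameters, and default values below are hypothetical illustrations, not part of any cited study.

```python
import collections
import random

def delayed_noisy_control(policy, plant_step, state, steps=100,
                          delay=3, noise_std=0.1, seed=0):
    """Run a control loop in which the commanded force reaches the plant
    only after `delay` control periods and is corrupted by Gaussian
    noise. `plant_step(state, force)` returns the next state."""
    rng = random.Random(seed)
    # Actuation pipeline: pre-filled with zeros so the first `delay`
    # applied commands are the idle command.
    pipeline = collections.deque([0.0] * delay)
    trajectory = [state]
    for _ in range(steps):
        pipeline.append(policy(state))                 # newest command enters
        applied = pipeline.popleft() + rng.gauss(0.0, noise_std)  # oldest leaves
        state = plant_step(state, applied)
        trajectory.append(state)
    return trajectory
```

With `noise_std=0` and a trivial integrator plant, the trajectory makes the delay visible: the first `delay` transitions apply the idle command before the policy's commands take effect, which is exactly the kind of mismatch between commanded and applied action that can destabilize learning.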