Reinforcement Learning for Position Control Problem of a Mobile Robot

Farías, Gonzalo; Garcia, Gonzalo; Montenegro, Guelis; Fábregas, Ernesto; Dormido-Canto, S.; Dormido, Sebastián

doi:10.1109/access.2020.3018026

Cited by 18 publications

(12 citation statements)

References 33 publications


“…These performance criteria are based on the cumulative error and can be applied to reference-trajectory tracking, indicating the error over the entire run between the reference trajectory and the actual trajectory followed by the robot. These indices are also used for control of position, distance, orientation, multi-robot formation, etc. (Caruntu et al., 2019), (Farias et al., 2020). The smaller the error, the better the trajectory followed and, consequently, the better the control algorithm.…”

Section: Final Error (unclassified)

“…Criterion (index, metric, measurement, procedure): Reference
% success in reaching the goal: (Gridnev et al., 2017); (Graba et al., 2020)
Integral absolute error (IAE): (Fernandes et al., 2017); (Caruntu et al., 2019); (Suarin et al., 2019); (Rivera et al., 2020); (Farias et al., 2020)
Integral time-weighted absolute error (ITAE): (Fernandes et al., 2017); (Suarin et al., 2019); (Farias et al., 2020)
Integral absolute signal control (IASC): (Fernandes et al., 2017)
Integral square error (ISE): (Perez et al., 2018); (Suarin et al., 2019); (Farias et al., 2020)
Integral time square error (ITSE): (Farias et al., 2020)
Control effort: (Rivera et al., 2020)
Safety: (Marvel and Bostelman, 2014); (Munoz et al., 2014)
Energy: (Munoz et al., 2014); (Fernandes et al., 2017); (Stefek et al., 2020); (Graba et al., 2020); (Wei and Isler, 2020); (Hrubý et al., 2021)…”

Section: Tbe Se Define Como (unclassified)
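The error-based indices named in the statement above (IAE, ISE, ITAE, ITSE) can be illustrated with a short numerical sketch. This is not taken from the cited papers: it assumes the usual textbook definitions (integrals of |e|, e², t·|e| and t·e² over the run), a sampled error signal, and trapezoidal integration. The names `trapezoid` and `error_indices` are hypothetical.

```python
import math


def trapezoid(t, y):
    """Trapezoidal-rule integral of samples y taken at times t."""
    return sum(0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1])
               for i in range(1, len(t)))


def error_indices(t, e):
    """Return IAE, ISE, ITAE and ITSE for an error signal e sampled at times t.

    Assumed (standard) definitions, not quoted from the cited works:
      IAE  = integral of |e(t)|        ISE  = integral of e(t)^2
      ITAE = integral of t * |e(t)|    ITSE = integral of t * e(t)^2
    """
    if len(t) != len(e):
        raise ValueError("t and e must have the same length")
    abs_e = [abs(x) for x in e]
    sq_e = [x * x for x in e]
    return {
        "IAE": trapezoid(t, abs_e),
        "ISE": trapezoid(t, sq_e),
        "ITAE": trapezoid(t, [ti * x for ti, x in zip(t, abs_e)]),
        "ITSE": trapezoid(t, [ti * x for ti, x in zip(t, sq_e)]),
    }


if __name__ == "__main__":
    # Illustrative signal only: an exponentially decaying position error.
    t = [i * 0.001 for i in range(10001)]   # 0 s to 10 s
    e = [math.exp(-x) for x in t]
    for name, value in error_indices(t, e).items():
        print(f"{name}: {value:.4f}")
```

In every case a smaller value indicates a smaller accumulated error; the time-weighted variants (ITAE, ITSE) penalize errors that persist late in the run more heavily.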


Criterios de desempeño para evaluar algoritmos de navegación de robots móviles: una revisión

Ceballos

Suarez-Rivera

2022

Rev. iberoam. autom. inform. ind.


This article presents a literature review of performance criteria for evaluating mobile robot navigation, which help to compare quantitatively different characteristics such as the control system, navigation in different working environments, energy performance, etc. Interest in performance criteria and comparison procedures (benchmarks) has grown in recent years, mainly among researchers and robot manufacturers seeking to meet the growing demand for applications in an increasingly competitive global market. The set of criteria comprises metrics, indices, measurements, and benchmarks, from the most basic, such as counting success in reaching the goal, through more elaborate ones, such as the safety of the trajectory generated during obstacle avoidance, to criteria that compare more complex aspects of navigation, such as energy consumption. Finally, some benchmarks and software for simulating and comparing navigation algorithms are described. These criteria constitute an important tool for designers and researchers in mobile robotics.


“…In the Q-Learning paradigm, an agent interacts with the environment and executes a set of actions as illustrated by Fig. 4 [15]. However, the update of state-action values in Q-Learning is defined by the following equation [15]-[17]:…”

Section: System Overview (mentioning)

confidence: 99%
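The equation referred to in the statement above is cut off in the excerpt, so it is not reproduced here. As a minimal sketch, assuming the standard tabular Q-learning rule Q(s, a) ← Q(s, a) + α [r + γ max_b Q(s', b) − Q(s, a)], the update can be written as follows. The function name `q_update`, the dictionary-based table, and the default values of `alpha` and `gamma` are illustrative assumptions, not the citing paper's implementation.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new Q(s, a).

    Q       : mapping (state, action) -> value; missing entries count as 0.0
    s, a    : state visited and action taken
    r       : reward received after taking a in s
    s_next  : resulting state
    actions : iterable of actions available in s_next
    alpha   : learning rate (assumed value)
    gamma   : discount factor (assumed value)
    """
    # Greedy estimate of the value of the next state.
    best_next = max(Q[(s_next, b)] for b in actions)
    # Temporal-difference target and incremental correction.
    td_target = r + gamma * best_next
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


if __name__ == "__main__":
    # Hypothetical two-action example: states are integers, actions are strings.
    Q = defaultdict(float)
    actions = ["forward", "left"]
    print(q_update(Q, 0, "forward", 1.0, 1, actions, alpha=0.5, gamma=0.9))
```

A dictionary keyed by (state, action) keeps the sketch independent of any particular discretization of the robot's state; on a memory-constrained microcontroller a fixed-size array would be the more likely choice.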


Implementation of Q-Learning Algorithm on Arduino: Application to Autonomous Mobile Robot Navigation in COVID-19 Field Hospitals

Latoui

Daâchi

2021

2021 International Conference on Electrical, Computer and Energy Technologies (ICECET)


[Fig. 1. ELEGOO Smart Robot Car V3.0.] … by robots. In 2020, by contrast, this percentage rose to 26%, as cited in [1]. Moreover, robots can be fixed in place or mobile, as stated by the International Organization for Standardization (ISO) in [2]. Mobile Robots (MRs) are gaining more and more popularity. They are widely used, specifically to improve the safety of workers by performing tough and dangerous tasks. To do these kinds of tasks, an MR must navigate safely within its environment which


“…Farias and friends developed an algorithm to control the position of a wheeled mobile robot. The main advantage is learning procedure which is done automatically with recursive procedure [24]. Farias and friends proposed an 3D simulation environment for control the position of a wheeled mobile robot.…”

Section: Introduction (mentioning)

confidence: 99%

Use of PID control during Education in Reinforcement Learning on Two Wheel Balance Robot

ATAÇ

Yildiz

ÜLKÜ

2021

Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji


This study's primary objective was to shorten the training time of the Reinforcement Learning (RL) method, which is one of the Machine Learning methods, by using the proportional-integral-derivative (PID) control method during training. In this study, a balancing robot with two wheels that can be controlled independently on the same axis is used. While the robot is in balance, the RL software block follows how the PID block maintains the balance, and the RL block learns how to behave against disturbing factors without physically falling and rising. In the training of RL, it is necessary to create approximately 500 policy/reward/path equations between the current state and future state matrices. The number of equations increases considerably when variables such as previous position and acceleration are added. Approximately 1000 trial-and-error attempts are required for training with RL alone. This means many falling/rising cycles. With the method we present, the RL block learned to keep the robot in balance without falling and without requiring human intervention in 900 trials. The time spent on a fall/stand-up cycle with RL alone was measured to be about 30 seconds (approximately 9 hours for 1000 attempts). On the other hand, PID-assisted learning took less than 4 hours of training since falling did not occur in many trials. This shows that the training period is shortened by approximately 60%.
