2021
DOI: 10.17485/ijst/v14i30.1030

Control and Simulation of a 6-DOF Biped Robot based on Twin Delayed Deep Deterministic Policy Gradient Algorithm

Abstract: Objectives: To study an algorithm that controls a bipedal robot so that it walks with a gait close to that of a human. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is a highly efficient algorithm that makes a few changes to the commonly used Deep Deterministic Policy Gradient (DDPG) algorithm for continuous action space problems in reinforcement learning. Methods: Unlike the commonly used sparse reward function model, in this study, a reward model com…
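
The "few changes" that TD3 makes to DDPG are commonly described as clipped double Q-learning, target policy smoothing, and delayed policy updates. The sketch below illustrates the first two in the computation of the critic's learning target. It is a minimal NumPy illustration, not the paper's implementation: the function name `td3_target`, the callables passed in, the hyperparameter values, and the 6-dimensional action (one per joint of a 6-DOF robot) are all assumptions made for this example.

```python
import numpy as np

def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=None):
    """Compute a TD3-style critic target for a batch of transitions.

    All argument names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double Q-learning: take the smaller of the two target critics.
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    q_min = np.minimum(q1, q2)

    # Bootstrapped target; terminal transitions (done == 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min


# Hypothetical stand-ins for the target networks, for demonstration only.
def dummy_actor(state):
    return np.zeros((state.shape[0], 6))      # assumed 6 joint commands

def dummy_critic_a(state, action):
    return np.full(state.shape[0], 2.0)

def dummy_critic_b(state, action):
    return np.full(state.shape[0], 1.0)

targets = td3_target(
    reward=np.array([1.0, 1.0]),
    done=np.array([0.0, 1.0]),
    next_state=np.zeros((2, 4)),              # assumed state dimension
    actor_target=dummy_actor,
    critic1_target=dummy_critic_a,
    critic2_target=dummy_critic_b,
    rng=np.random.default_rng(0),
)
```

The third change, delayed policy updates, is not shown: in TD3 the actor and the target networks are typically updated less often than the critics, for example once every two critic updates.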

Citations: Cited by 4 publications (3 citation statements)
References: References 24 publications (31 reference statements)
“…In this research, an extension of the TD3 [20] algorithm was proposed to include more information about the connection between the joints of the robot in the training process. In fact, there are many articles [20][21][22][23][24][25] using reinforcement learning algorithms such as TD3, DDPG and SAC to find the desired angle values of the joints of the robot. However, their algorithms only used the information about the velocity and angular value of the joints for training, they did not take advantage of the graph topology and the binding relationship of the humanoid robot, as in our method.…”
Section: Introduction (mentioning)
confidence: 99%
“…In each state, the height of the body robot has different values. In the paper [25], only an average value of the body height during motion is used as a basis height for the robot to learn, two average values of the body height corresponding to two grounding states of the legs in motion are used in this paper. At the single phase of walking (Figure 7a,b), the average height of the robot's body reaches a higher value than that at the double phase of walking (Figure 7c,d).…”
mentioning
confidence: 99%
“…Meanwhile, controllers based on fuzzy logic imitating human natural reasoning have been studied and applied in many control problems in general and in the control of bipedal robots-for example, a fuzzy logic-based controller for robot in mechanical processing [34,35], a fuzzy-based-admittance controller for safe natural human-robot interaction [36], and a fuzzy logic-based bipedal robot control [37][38][39]. Moreover, the integration of fuzzy logic with intelligent algorithms is also a research direction that is being applied to control bipedal robots [40][41][42].…”
Section: Introduction (mentioning)
confidence: 99%