2021
DOI: 10.1109/access.2021.3106662

An Empirical Investigation of Early Stopping Optimizations in Proximal Policy Optimization

Abstract: Code-level optimizations, which are low-level optimization techniques used in the implementation of algorithms, have generally been considered tangential and often do not appear in published pseudo-code of Reinforcement Learning (RL) algorithms. However, recent studies suggest that these optimizations are critical to the performance of algorithms such as Proximal Policy Optimization (PPO). In this paper, we investigate the effect of one such optimization known as "early stopping" implemented for PPO in the pop…
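The "early stopping" optimization investigated here halts the per-epoch policy update loop once the new policy has drifted too far from the policy that collected the data, as measured by an approximate KL divergence. The following is a minimal PyTorch-flavored sketch of that mechanism, not the paper's exact code; the names (policy, target_kl) and the 0.015 threshold and 1.5 multiplier are illustrative assumptions borrowed from common PPO implementations.

# Minimal sketch (not the paper's code) of PPO's "early stopping" optimization:
# stop the per-epoch policy update loop once the approximate KL divergence
# between the new and old policies exceeds a threshold.
import torch

def update_policy(policy, optimizer, obs, act, logp_old, adv,
                  clip_ratio=0.2, train_iters=80, target_kl=0.015):
    # policy(obs) is assumed to return a torch.distributions object;
    # logp_old and adv are the behavior-policy log-probabilities and
    # advantages computed when the batch of (obs, act) pairs was collected.
    for i in range(train_iters):
        logp = policy(obs).log_prob(act)                # log pi_new(a|s)
        ratio = torch.exp(logp - logp_old)              # pi_new / pi_old
        clipped = torch.clamp(ratio, 1 - clip_ratio, 1 + clip_ratio) * adv
        loss = -torch.min(ratio * adv, clipped).mean()  # clipped surrogate

        # Sample-based estimate of KL(pi_old || pi_new); the 1.5 multiplier
        # is a common heuristic, not something prescribed by the paper.
        approx_kl = (logp_old - logp).mean().item()
        if approx_kl > 1.5 * target_kl:
            break  # early stopping: skip the remaining gradient steps

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()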

Cited by 11 publications (5 citation statements). References 1 publication (2 reference statements).
“…Then, they took it up a notch by adding calculated indices, propelling the accuracy to a commanding 99.58%, compared to others listed in Table 1. Techniques such as data augmentation [40], gradient clipping [41], adaptive learning rates [42], and early stopping [43] optimized performance and reduced computational time. Their WRN-based approach achieved a remarkable accuracy of 99.17%, setting a new benchmark in efficiency and accuracy.…”
Section: Related Work
confidence: 99%
“…The combination of LSTM for movement prediction and PPO for decision-making enables the exoskeleton to provide targeted assistance that adapts to the user's changing needs and capabilities. The PPO model is known for its inherent ability to adjust and smooth out discontinuities, making the response smoothly track the user's intended movement [18]. This dynamic approach to control allows for a more natural interaction between the user and the exoskeleton, potentially enhancing the rehabilitation process by encouraging active participation and facilitating the correct execution of therapeutic movements.…”
Section: Reinforcement Learning Model
confidence: 99%
“…Proximal policy optimization (PPO) is an on-policy deep reinforcement learning method developed by OpenAI in 2017, and it serves as the default deep reinforcement learning algorithm utilized by OpenAI [30]. Compared with off-policy deep reinforcement learning algorithms like deep Q-network (DQN) and deep deterministic policy gradient (DDPG), the PPO algorithm typically exhibits superior stability and convergence.…”
Section: Proximal Policy Optimization Algorithm
confidence: 99%
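
The stability that this last passage attributes to PPO stems largely from its clipped surrogate objective, which bounds how far each update can move the policy away from the data-collecting policy. For reference, the standard objective from Schulman et al. (2017) is

L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\big[\min\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\big], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

where \epsilon is the clip ratio (commonly 0.2). The early-stopping optimization investigated in the cited paper is a code-level addition on top of this clipping, cutting the update loop short based on an approximate KL divergence.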