2021
DOI: 10.1109/access.2021.3106662

An Empirical Investigation of Early Stopping Optimizations in Proximal Policy Optimization

Abstract: Code-level optimizations, which are low-level optimization techniques used in the implementation of algorithms, have generally been considered tangential and often do not appear in published pseudo-code of Reinforcement Learning (RL) algorithms. However, recent studies suggest that these optimizations are critical to the performance of algorithms such as Proximal Policy Optimization (PPO). In this paper, we investigate the effect of one such optimization known as "early stopping" implemented for PPO in the pop…
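The "early stopping" optimization investigated here halts the per-epoch policy update loop once the new policy has drifted too far from the policy that collected the data, as measured by an approximate KL divergence. The following is a minimal PyTorch-flavored sketch of that mechanism, not the paper's exact code; the names (policy, target_kl) and the 0.015 threshold and 1.5 multiplier are illustrative assumptions borrowed from common PPO implementations.

# Minimal sketch (not the paper's code) of PPO's "early stopping" optimization:
# stop the per-epoch policy update loop once the approximate KL divergence
# between the new and old policies exceeds a threshold.
import torch

def update_policy(policy, optimizer, obs, act, logp_old, adv,
                  clip_ratio=0.2, train_iters=80, target_kl=0.015):
    # policy(obs) is assumed to return a torch.distributions object;
    # logp_old and adv are the behavior-policy log-probabilities and
    # advantages computed when the batch of (obs, act) pairs was collected.
    for i in range(train_iters):
        logp = policy(obs).log_prob(act)                # log pi_new(a|s)
        ratio = torch.exp(logp - logp_old)              # pi_new / pi_old
        clipped = torch.clamp(ratio, 1 - clip_ratio, 1 + clip_ratio) * adv
        loss = -torch.min(ratio * adv, clipped).mean()  # clipped surrogate

        # Sample-based estimate of KL(pi_old || pi_new); the 1.5 multiplier
        # is a common heuristic, not something prescribed by the paper.
        approx_kl = (logp_old - logp).mean().item()
        if approx_kl > 1.5 * target_kl:
            break  # early stopping: skip the remaining gradient steps

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()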

Cited by 11 publications (5 citation statements). References 1 publication (2 reference statements).
“…Then, they took it up a notch by adding calculated indices, propelling the accuracy to a commanding 99.58%, compared to others listed in Table 1. Techniques such as data augmentation [40], gradient clipping [41], adaptive learning rates [42], and early stopping [43] optimized performance and reduced computational time. Their WRN-based approach achieved a remarkable accuracy of 99.17%, setting a new benchmark in efficiency and accuracy.…”
Section: Related Work
confidence: 99%
“…The combination of LSTM for movement prediction and PPO for decision-making enables the exoskeleton to provide targeted assistance that adapts to the user's changing needs and capabilities. The PPO model is known for its inherent ability to adjust and smooth out discontinuities, making the response smoothly track the user's intended movement [18]. This dynamic approach to control allows for a more natural interaction between the user and the exoskeleton, potentially enhancing the rehabilitation process by encouraging active participation and facilitating the correct execution of therapeutic movements.…”
Section: Reinforcement Learning Model
confidence: 99%
“…Proximal policy optimization (PPO) is an on-policy deep reinforcement learning method developed by OpenAI in 2017, and it serves as the default deep reinforcement learning algorithm utilized by OpenAI [30]. Compared with off-policy deep reinforcement learning algorithms like deep Q-network (DQN) and deep deterministic policy gradient (DDPG), the PPO algorithm typically exhibits superior stability and convergence.…”
Section: Proximal Policy Optimization Algorithm
confidence: 99%
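
The stability that this last passage attributes to PPO stems largely from its clipped surrogate objective, which bounds how far each update can move the policy away from the data-collecting policy. For reference, the standard objective from Schulman et al. (2017) is

L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\big[\min\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\big], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

where \epsilon is the clip ratio (commonly 0.2). The early-stopping optimization investigated in the cited paper is a code-level addition on top of this clipping, cutting the update loop short based on an approximate KL divergence.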