2018
DOI: 10.1109/tnnls.2018.2790981

Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning

Abstract: In this paper, a new training paradigm is proposed for deep reinforcement learning using self-paced prioritized curriculum learning with coverage penalty. The proposed deep curriculum reinforcement learning (DCRL) takes full advantage of experience replay by adaptively selecting appropriate transitions from the replay memory based on the complexity of each transition. The complexity criteria in DCRL consist of a self-paced priority and a coverage penalty. The self-paced priority reflects the relationship…
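The selection criterion described in the abstract (a self-paced priority combined with a coverage penalty over frequently replayed transitions) can be illustrated with a short sketch. The functional forms, the parameter names (spl_threshold, penalty_coef), and the way the two terms are combined below are assumptions made for illustration, not the paper's exact equations.

```python
import numpy as np

def dcrl_complexity_scores(td_errors, replay_counts, spl_threshold=1.0, penalty_coef=0.1):
    """Illustrative DCRL-style selection scores (not the paper's exact formulas).

    The self-paced priority favours transitions whose TD error is close to a
    difficulty threshold that grows as training progresses; the coverage
    penalty discounts transitions that have already been replayed often.
    """
    td_errors = np.abs(np.asarray(td_errors, dtype=float))
    replay_counts = np.asarray(replay_counts, dtype=float)

    # Self-paced priority: high for "appropriately difficult" transitions,
    # low for transitions far easier or far harder than the current threshold.
    self_paced_priority = np.exp(-((td_errors - spl_threshold) ** 2))

    # Coverage penalty: the more often a transition has been replayed,
    # the lower its selection score.
    coverage_penalty = penalty_coef * replay_counts

    scores = np.maximum(self_paced_priority - coverage_penalty, 1e-6)
    return scores / scores.sum()  # normalised sampling probabilities
```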

Cited by 102 publications (54 citation statements).
References 36 publications (48 reference statements).
“…This mapping results from sampling the training set according to the probabilities at the current epoch $p^{(e)}$. Minibatches are then formed from $\{X, Y\}_c$ and the probabilities are decayed towards a uniform distribution [2], based on the following function [12]: $\exp(-c\,n_i^2/10)\ \forall e > 0$,…”
mentioning
confidence: 99%
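A minimal sketch of the decay described in this excerpt, assuming the quoted factor $\exp(-c\,n_i^2/10)$ acts as a per-epoch mixing weight between the curriculum distribution and the uniform distribution; the exact roles of c and n_i in the cited work may differ.

```python
import numpy as np

def decayed_sampling_probs(p_init, epoch, c=1.0):
    """Hedged sketch: decay per-example sampling probabilities toward uniform.

    Interprets the quoted factor exp(-c * n_i**2 / 10) as a mixing weight at
    epoch n_i = epoch; this interpretation is an assumption.
    """
    p_init = np.asarray(p_init, dtype=float)
    uniform = np.full_like(p_init, 1.0 / p_init.size)
    if epoch <= 0:
        return p_init
    w = np.exp(-c * epoch ** 2 / 10.0)    # curriculum weight shrinks over epochs
    p = w * p_init + (1.0 - w) * uniform  # interpolate toward uniform sampling
    return p / p.sum()
```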
“…An RL agent learns a mapping between the environment's state space and the action space through its interaction with the environment, including observing the system's state, selecting and executing actions, and receiving a numerical reward [23]. The mathematical and theoretical basis of RL is discrete-time finite-state MDPs [24]. In a general way, a five-element tuple…”
Section: Reinforcement
mentioning
confidence: 99%
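The five-element tuple of a discrete-time finite-state MDP mentioned in this excerpt is conventionally (S, A, P, R, γ); a minimal, generic container (not taken from the cited paper) might look like:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class FiniteMDP:
    """Generic container for the standard MDP five-tuple (S, A, P, R, gamma)."""
    states: List[int]                                     # S: finite state set
    actions: List[int]                                    # A: finite action set
    transition: Dict[Tuple[int, int], Dict[int, float]]  # P(s' | s, a)
    reward: Dict[Tuple[int, int], float]                  # R(s, a)
    gamma: float                                          # discount factor in [0, 1)
```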
“…Schaul et al. [39] introduce Prioritized Experience Replay (PER), which prioritizes important experiences to be sampled from the replay buffer so that training examples follow a curriculum learning scheme. Subsequently, Ren et al. [40] combine a self-paced prioritization function with a coverage penalty function, selecting samples of appropriate difficulty while penalizing samples that are replayed too frequently. Other studies, such as [41] and [42], use curriculum learning to schedule an ordered list of tasks and maps to be solved by the RL agent.…”
Section: Curriculum Learning
mentioning
confidence: 99%
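A simplified sketch of the proportional prioritized sampling that PER [39] is based on, omitting the sum-tree data structure and importance-sampling weight correction used in the full algorithm:

```python
import numpy as np

def per_sample(priorities, batch_size, alpha=0.6, rng=None):
    """Simplified proportional PER sampling: P(i) is proportional to p_i**alpha.

    Illustrative only; the full algorithm uses a sum-tree for efficiency and
    corrects the induced bias with importance-sampling weights.
    """
    rng = rng or np.random.default_rng()
    p = np.asarray(priorities, dtype=float) ** alpha
    probs = p / p.sum()
    return rng.choice(len(priorities), size=batch_size, p=probs)
```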