Pascal Klink scite author profile

Pascal Klink

4 Publications

14 Citation Statements Received

56 Citation Statements Given


Affiliations

Technical University of Darmstadt

Publications


Generalized Mean Estimation in Monte-Carlo Tree Search

Dam, Klink, D’Eramo et al. 2020


We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is built incrementally by expanding nodes, and node values are updated through a backup strategy based on the average value of the child nodes. However, it has been shown that, with enough samples, the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further and propose a novel backup strategy based on the power mean operator, which computes a value between the average and the maximum. We call our new approach Power-UCT and argue that the power mean operator helps to speed up learning in MCTS. We analyze our method theoretically and provide guarantees of convergence to the optimum. Finally, we empirically demonstrate the effectiveness of our method on well-known MDP and POMDP benchmarks, showing significant improvements in performance and convergence speed relative to state-of-the-art algorithms.
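The abstract states only that the power mean lies between the average and the maximum. As a rough illustration of that property, and not the paper's implementation, the following sketch computes a weighted power mean of child value estimates. The function name `power_mean`, the use of visit counts as weights, and the restriction to non-negative values with exponent p >= 1 are assumptions made for this example.

```python
import math


def power_mean(values, weights, p):
    """Weighted power mean of non-negative values (illustrative sketch).

    Assumptions for this example, not taken from the abstract:
    - values are non-negative child value estimates,
    - weights are non-negative (for instance visit counts) and sum to > 0,
    - p >= 1, where p = 1 gives the weighted average and
      p = math.inf gives the maximum over children with positive weight.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    if p < 1:
        raise ValueError("this sketch only handles p >= 1")
    if any(v < 0 for v in values) or any(w < 0 for w in weights):
        raise ValueError("values and weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    if math.isinf(p):
        # Limit case: the power mean tends to the largest weighted-in value.
        return max(v for v, w in zip(values, weights) if w > 0)
    # Normalize the weights, average the p-th powers, then take the p-th root.
    mean_of_powers = sum((w / total) * (v ** p) for v, w in zip(values, weights))
    return mean_of_powers ** (1.0 / p)


if __name__ == "__main__":
    child_values = [0.2, 0.8]   # hypothetical child value estimates
    visit_counts = [1, 1]       # hypothetical visit counts used as weights
    for exponent in (1, 2, 8, math.inf):
        print(exponent, power_mean(child_values, visit_counts, exponent))
```

Raising the exponent moves the result from the weighted average toward the maximum, which is the interpolation the abstract describes.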


A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning

Klink, Abdulsamad, Belousov et al. 2021

Preprint


Across machine learning, the use of curricula has shown strong empirical potential to improve learning from data by avoiding local optima of training objectives. For reinforcement learning (RL), curricula are especially interesting, as the underlying optimization has a strong tendency to get stuck in local optima due to the exploration-exploitation trade-off. Recently, a number of approaches for automatically generating curricula for RL have been shown to increase performance while requiring less expert knowledge than manually designed curricula. However, these approaches are seldom investigated from a theoretical perspective, which prevents a deeper understanding of their mechanics. In this paper, we present an approach for automated curriculum generation in RL with a clear theoretical underpinning. More precisely, we formalize the well-known self-paced learning paradigm as inducing a distribution over training tasks, which trades off between task complexity and the objective to match a desired task distribution. Experiments show that training on this induced distribution helps to avoid poor local optima across RL algorithms in different tasks with uninformative rewards and challenging exploration requirements.


Model-Based Reinforcement Learning from PILCO to PETS

Klink 2021


Self-Paced Deep Reinforcement Learning

Klink, D’Eramo, Peters et al. 2020

Preprint

