A typical goal for transfer learning algorithms is to utilize knowledge gained in a source task to learn a target task faster. Recently introduced transfer methods in reinforcement learning settings have shown considerable promise, but they typically transfer between pairs of very similar tasks. This work introduces Rule Transfer, a transfer algorithm that first learns rules to summarize a source task policy and then leverages those rules to learn faster in a target task. This paper demonstrates that Rule Transfer can effectively speed up learning in Keepaway, a benchmark RL problem in the robot soccer domain, based on experience from source tasks in the gridworld domain. We empirically show, through the use of three distinct transfer metrics, that Rule Transfer is effective across these domains.
Reinforcement learning agents typically require a significant amount of data before performing well on complex tasks. Transfer learning methods have made progress reducing sample complexity, but they have primarily been applied to model-free learning methods, not more data-efficient model-based learning methods. This paper introduces timbrel, a novel method capable of transferring information effectively into a model-based reinforcement learning algorithm. We demonstrate that timbrel can significantly improve the sample efficiency and asymptotic performance of a model-based algorithm when learning in a continuous state space. Additionally, we conduct experiments to test the limits of timbrel's effectiveness.
Abstract-Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful rangeadaptive tile coding method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.