2021
DOI: 10.3390/biomimetics6010013
An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users

Abstract: Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing…
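As a concrete illustration of the setting the abstract describes, the sketch below pairs a tabular Q-learning agent with a simulated user that offers action advice with configurable availability and accuracy, so experiments can be repeated without a human in the loop. The corridor task, the SimulatedUser class, and the parameter names are illustrative assumptions for this sketch, not the environments or interfaces from the paper.

```python
import random

# A minimal interactive Q-learning sketch. The CorridorEnv task, the
# SimulatedUser class, and the availability/accuracy parameters are
# assumptions made for illustration; they are not taken from the paper.

class CorridorEnv:
    """Agent starts in cell 0 and must reach the rightmost cell; action 1 = right, 0 = left."""
    def __init__(self, n_cells=10):
        self.n_cells = n_cells
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos = max(0, min(self.n_cells - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.n_cells - 1
        reward = 1.0 if done else -0.01
        return self.pos, reward, done

class SimulatedUser:
    """Replaces a human advisor so experiments can be repeated cheaply and consistently."""
    def __init__(self, availability=0.5, accuracy=0.9):
        self.availability = availability  # probability of giving advice on any step
        self.accuracy = accuracy          # probability the advice is the optimal action

    def advise(self, state):
        if random.random() > self.availability:
            return None                   # the user stays silent this step
        optimal = 1                       # in the corridor, moving right is always optimal
        return optimal if random.random() < self.accuracy else 0

def train(episodes=200, alpha=0.1, gamma=0.99, epsilon=0.1):
    env, user = CorridorEnv(), SimulatedUser()
    q = [[0.0, 0.0] for _ in range(env.n_cells)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            advice = user.advise(state)
            if advice is not None:
                action = advice                       # follow the (possibly wrong) advice
            elif random.random() < epsilon:
                action = random.choice([0, 1])        # explore
            else:
                action = 0 if q[state][0] >= q[state][1] else 1  # exploit
            next_state, reward, done = env.step(action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    q_table = train()
    print("Q-values in the start state after training:", q_table[0])
```

Because the advisor is simulated, the same experiment can be rerun with different availability or accuracy settings without any additional human effort.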

Cited by 12 publications (12 citation statements)
References 41 publications
“…A number of different agents and simulated users have been designed and applied to the mountain car and self-driving car environments. Simulated users have been chosen over actual human trials, as they allow rapid and controlled experiments [38]. When employing simulated users, interaction characteristics such as knowledge level, accuracy, and availability can be set to specific and measurable levels.…”
Section: Experimental Methodology
confidence: 99%
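The excerpt above names three interaction characteristics, knowledge level, accuracy, and availability, that a simulated user makes controllable. One possible way to expose them as explicit, measurable parameters is sketched below; mapping knowledge level to a set of known states, and the ConfigurableSimulatedUser name, are assumptions for illustration rather than the definitions used in [38].

```python
import random

# Illustrative sketch: each interaction characteristic becomes a settable
# parameter. The specific mapping (knowledge level as a set of known states)
# is an assumption, not the definition used in [38].

class ConfigurableSimulatedUser:
    def __init__(self, oracle_policy, known_states, accuracy=0.9, availability=0.3):
        self.oracle_policy = oracle_policy  # the reference policy the user "knows"
        self.known_states = known_states    # knowledge level: states the user can advise on
        self.accuracy = accuracy            # P(correct advice | user responds on a known state)
        self.availability = availability    # P(user responds at all)

    def advise(self, state, action_space):
        if state not in self.known_states:
            return None                                # outside the user's knowledge
        if random.random() > self.availability:
            return None                                # not available this step
        if random.random() < self.accuracy:
            return self.oracle_policy(state)           # correct advice
        return random.choice(action_space)             # inaccurate (random) advice

# A user who knows half of a 10-state task, advises "right" (1) when correct,
# and responds on roughly 30% of the steps it is consulted.
user = ConfigurableSimulatedUser(lambda s: 1, known_states=set(range(5)))
print(user.advise(2, action_space=[0, 1]))
```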
“…The mountain car environment is used in these experiments since it is a common benchmark problem in RL, complex enough to effectively test agents yet simple enough for human observers to intuitively determine the correct policy. Additionally, the mountain car environment has been previously used in a human trial evaluating different advice delivery styles [3] and with simulated users [38]. We use the results reported in the human trial to set a realistic level of interaction for evaluative and informative advice agents.…”
Section: Non-persistent and Persistent State-based Agents
confidence: 99%
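For the mountain car environment mentioned above, a simulated user needs a source of knowledge about the correct policy. A common choice, assumed here purely for illustration (it is not necessarily the advice model used in [3] or [38]), is the energy-pumping heuristic: push in the direction of the car's current velocity. The sketch below uses Gymnasium's MountainCar-v0 and follows that advice on every step.

```python
import gymnasium as gym

def energy_pumping_advice(observation):
    """Advised action for MountainCar-v0: 0 = push left, 2 = push right."""
    _, velocity = observation
    return 2 if velocity >= 0 else 0  # push in the direction of motion to build momentum

if __name__ == "__main__":
    env = gym.make("MountainCar-v0")
    observation, _ = env.reset(seed=0)
    episode_return, terminated, truncated = 0.0, False, False
    while not (terminated or truncated):
        action = energy_pumping_advice(observation)   # always follow the simulated advice
        observation, reward, terminated, truncated, _ = env.step(action)
        episode_return += reward
    print("Return when always following the simulated advice:", episode_return)
    env.close()
```

Lowering the simulated user's accuracy or availability, as in the earlier sketches, then gives a controlled way to study how imperfect advice affects learning in this environment.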
“…While the interactive agent's human-related approach to learning is one of its greatest strengths, it may also be its greatest weakness [33], [34]. Accurate advice given at the proper time greatly helps the agent speed up its search for the optimal solution.…”
Section: Interactive Feedback
confidence: 99%
“…The first is the time required by the human. In this regard, it is important that the mechanisms used to provide advice to the agent serve to reduce the number of interactions required [18]. The second barrier is the skill needed by the human to provide the information.…”
Section: Introduction
confidence: 99%
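One mechanism of the kind the excerpt above alludes to is advice persistence: remembering what the advisor said about a state and reusing it on later visits, so the number of consultations grows with the number of distinct advised states rather than with the number of time steps. The cache-based wrapper below is a sketch of that idea under our own assumptions, not the specific mechanism of [18].

```python
# Sketch of advice reuse: wrap any advisor that exposes advise(state), such as
# the SimulatedUser sketched earlier, and consult it at most once per state.

class PersistentAdvice:
    def __init__(self, advisor):
        self.advisor = advisor
        self.remembered = {}   # state -> advice given on the first visit
        self.interactions = 0  # number of times the underlying advisor was consulted

    def advise(self, state):
        if state in self.remembered:
            return self.remembered[state]     # reuse stored advice; no new interaction
        self.interactions += 1
        advice = self.advisor.advise(state)
        if advice is not None:
            self.remembered[state] = advice   # persist for future visits to this state
        return advice
```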