2021
DOI: 10.3390/biomimetics6010013
An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users

Abstract: Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing…
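As a concrete illustration of the setting the abstract describes, the sketch below pairs a tabular Q-learning agent with a simulated user that offers action advice with configurable availability and accuracy, so experiments can be repeated without a human in the loop. The corridor task, the SimulatedUser class, and the parameter names are illustrative assumptions for this sketch, not the environments or interfaces from the paper.

```python
import random

# A minimal interactive Q-learning sketch. The CorridorEnv task, the
# SimulatedUser class, and the availability/accuracy parameters are
# assumptions made for illustration; they are not taken from the paper.

class CorridorEnv:
    """Agent starts in cell 0 and must reach the rightmost cell; action 1 = right, 0 = left."""
    def __init__(self, n_cells=10):
        self.n_cells = n_cells
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos = max(0, min(self.n_cells - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.n_cells - 1
        reward = 1.0 if done else -0.01
        return self.pos, reward, done

class SimulatedUser:
    """Replaces a human advisor so experiments can be repeated cheaply and consistently."""
    def __init__(self, availability=0.5, accuracy=0.9):
        self.availability = availability  # probability of giving advice on any step
        self.accuracy = accuracy          # probability the advice is the optimal action

    def advise(self, state):
        if random.random() > self.availability:
            return None                   # the user stays silent this step
        optimal = 1                       # in the corridor, moving right is always optimal
        return optimal if random.random() < self.accuracy else 0

def train(episodes=200, alpha=0.1, gamma=0.99, epsilon=0.1):
    env, user = CorridorEnv(), SimulatedUser()
    q = [[0.0, 0.0] for _ in range(env.n_cells)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            advice = user.advise(state)
            if advice is not None:
                action = advice                       # follow the (possibly wrong) advice
            elif random.random() < epsilon:
                action = random.choice([0, 1])        # explore
            else:
                action = 0 if q[state][0] >= q[state][1] else 1  # exploit
            next_state, reward, done = env.step(action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    q_table = train()
    print("Q-values in the start state after training:", q_table[0])
```

Because the advisor is simulated, the same experiment can be rerun with different availability or accuracy settings without any additional human effort.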

Cited by 12 publications (12 citation statements)
References 41 publications
“…A number of different agents and simulated users have been designed and applied to the mountain car and self-driving car environments. Simulated users have been chosen over actual human trials, as they allow rapid and controlled experiments [38]. When employing simulated users, interaction characteristics such as knowledge level, accuracy, and availability can be set to specific and measurable levels.…”
Section: Experimental Methodology
confidence: 99%
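The excerpt above names three interaction characteristics, knowledge level, accuracy, and availability, that a simulated user makes controllable. One possible way to expose them as explicit, measurable parameters is sketched below; mapping knowledge level to a set of known states, and the ConfigurableSimulatedUser name, are assumptions for illustration rather than the definitions used in [38].

```python
import random

# Illustrative sketch: each interaction characteristic becomes a settable
# parameter. The specific mapping (knowledge level as a set of known states)
# is an assumption, not the definition used in [38].

class ConfigurableSimulatedUser:
    def __init__(self, oracle_policy, known_states, accuracy=0.9, availability=0.3):
        self.oracle_policy = oracle_policy  # the reference policy the user "knows"
        self.known_states = known_states    # knowledge level: states the user can advise on
        self.accuracy = accuracy            # P(correct advice | user responds on a known state)
        self.availability = availability    # P(user responds at all)

    def advise(self, state, action_space):
        if state not in self.known_states:
            return None                                # outside the user's knowledge
        if random.random() > self.availability:
            return None                                # not available this step
        if random.random() < self.accuracy:
            return self.oracle_policy(state)           # correct advice
        return random.choice(action_space)             # inaccurate (random) advice

# A user who knows half of a 10-state task, advises "right" (1) when correct,
# and responds on roughly 30% of the steps it is consulted.
user = ConfigurableSimulatedUser(lambda s: 1, known_states=set(range(5)))
print(user.advise(2, action_space=[0, 1]))
```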
“…The mountain car environment is used in these experiments since it is a common benchmark problem in RL, complex enough to effectively test agents yet simple enough for human observers to intuitively determine the correct policy. Additionally, the mountain car environment has been previously used in a human trial evaluating different advice delivery styles [3] and with simulated users [38]. We use the results reported in the human trial to set a realistic level of interaction for evaluative and informative advice agents.…”
Section: Non-persistent and Persistent State-based Agents
confidence: 99%
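For the mountain car environment mentioned above, a simulated user needs a source of knowledge about the correct policy. A common choice, assumed here purely for illustration (it is not necessarily the advice model used in [3] or [38]), is the energy-pumping heuristic: push in the direction of the car's current velocity. The sketch below uses Gymnasium's MountainCar-v0 and follows that advice on every step.

```python
import gymnasium as gym

def energy_pumping_advice(observation):
    """Advised action for MountainCar-v0: 0 = push left, 2 = push right."""
    _, velocity = observation
    return 2 if velocity >= 0 else 0  # push in the direction of motion to build momentum

if __name__ == "__main__":
    env = gym.make("MountainCar-v0")
    observation, _ = env.reset(seed=0)
    episode_return, terminated, truncated = 0.0, False, False
    while not (terminated or truncated):
        action = energy_pumping_advice(observation)   # always follow the simulated advice
        observation, reward, terminated, truncated, _ = env.step(action)
        episode_return += reward
    print("Return when always following the simulated advice:", episode_return)
    env.close()
```

Lowering the simulated user's accuracy or availability, as in the earlier sketches, then gives a controlled way to study how imperfect advice affects learning in this environment.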
“…While the interactive agent's human-related approach to learning is one of its greatest strengths, it may also be its greatest weakness [33], [34]. Accurate advice given at the proper time greatly helps the agent speed up its search for the optimal solution.…”
Section: Interactive Feedback
confidence: 99%
“…The first is the time required by the human. In this regard, it is important that the mechanisms used to provide advice to the agent serve to reduce the number of interactions required [18]. The second barrier is the skill needed by the human to provide the information.…”
Section: Introduction
confidence: 99%
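One mechanism of the kind the excerpt above alludes to is advice persistence: remembering what the advisor said about a state and reusing it on later visits, so the number of consultations grows with the number of distinct advised states rather than with the number of time steps. The cache-based wrapper below is a sketch of that idea under our own assumptions, not the specific mechanism of [18].

```python
# Sketch of advice reuse: wrap any advisor that exposes advise(state), such as
# the SimulatedUser sketched earlier, and consult it at most once per state.

class PersistentAdvice:
    def __init__(self, advisor):
        self.advisor = advisor
        self.remembered = {}   # state -> advice given on the first visit
        self.interactions = 0  # number of times the underlying advisor was consulted

    def advise(self, state):
        if state in self.remembered:
            return self.remembered[state]     # reuse stored advice; no new interaction
        self.interactions += 1
        advice = self.advisor.advise(state)
        if advice is not None:
            self.remembered[state] = advice   # persist for future visits to this state
        return advice
```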