2013
DOI: 10.1007/978-3-319-02675-6_46
Training a Robot via Human Feedback: A Case Study

Abstract: We present a case study of applying a framework for learning from numeric human feedback (TAMER) to a physically embodied robot. In doing so, we also provide the first demonstration of the ability to train multiple behaviors by such feedback without algorithmic modifications and of a robot learning from free-form human-generated feedback without any further guidance or evaluative feedback. We describe transparency challenges specific to a physically embodied robot learning from human feedback and adjus…

Cited by 106 publications (88 citation statements)
References 17 publications
“…Their algorithm uses reinforcement learning to improve over the initial sequences provided by the user, and it incorporates on-line feedback from the user during the learning process creating a novel dynamic reward shaping mechanism to converge faster to an optimal policy. Furthermore, in [18], the human trainer, an author of that study, followed a predetermined algorithm of giving positive reward for desired actions and negative reward otherwise. They explored how the Interactive Reinforcement Learning algorithm that enables a human trainer to provide both rewards and anticipatory guidance for the learner can be applied to a real-world robotic system.…”
Section: Related Work
confidence: 99%
“…The doll is designed to carry on an extensive conversation with young girls or boys in areas of their interest. Knox, Breazeal, and Stone (2013) present a case study of applying a framework for learning from human feedback to an interactive robot. They claim this application as a first demonstration of the ability to train multiple behaviors in robot learning from free-form human-generated feedback without any further guidance or evaluative feedback.…”
Section: Human-Robot Social Interaction
confidence: 99%
“…Some approaches use human feedback as shaping signals to teach a system how to achieve a task. In such approaches, the source of the feedback is considered as an observer of the system who evaluates each of the system's actions [26,27] or the system's entire policy [28]. The TAMER (Training an Agent Manually via Evaluative Reinforcement) framework [26] proposes a method to shape a learning robot by giving positive and negative signals (as for a domestic dog).…”
Section: User Feedback for Smarter Homes
confidence: 99%
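The core idea the citing papers describe — learning a model of the human's evaluative signal and acting greedily on it — can be sketched in a few lines. This is a rough illustration only, assuming a tabular state/action space and a simple supervised update; the class and parameter names are ours, not from the TAMER papers, and the real framework handles credit assignment over feedback delays that this sketch omits.

```python
class TamerAgent:
    """Minimal TAMER-style learner (illustrative sketch, not the original)."""

    def __init__(self, states, actions, lr=0.1):
        self.actions = list(actions)
        # H[(s, a)] estimates the human trainer's feedback for taking a in s
        self.H = {(s, a): 0.0 for s in states for a in self.actions}
        self.lr = lr

    def act(self, state):
        # Act myopically: pick the action predicted to earn the most human
        # reward now, rather than maximizing a discounted future return.
        return max(self.actions, key=lambda a: self.H[(state, a)])

    def give_feedback(self, state, action, signal):
        # Supervised update toward the trainer's scalar signal (+1 / -1):
        # move the estimate a fraction lr of the way toward the target.
        error = signal - self.H[(state, action)]
        self.H[(state, action)] += self.lr * error
```

A trainer who consistently rewards one action in a state will quickly make it the greedy choice, which is the "shaping by positive and negative signals" behavior the quoted passage describes.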