2011 IEEE International Conference on Rehabilitation Robotics
DOI: 10.1109/icorr.2011.5975338

Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning

Abstract: As a contribution toward the goal of adaptable, intelligent artificial limbs, this work introduces a continuous actor-critic reinforcement learning method for optimizing the control of multi-function myoelectric devices. Using a simulated upper-arm robotic prosthesis, we demonstrate how it is possible to derive successful limb controllers from myoelectric data using only a sparse human-delivered training signal, without requiring detailed knowledge about the task domain. This reinforcement-based machine learning…
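The abstract describes deriving a controller through continuous actor-critic reinforcement learning driven by a sparse human-delivered reward signal. The following is a minimal sketch of that style of update, not the paper's implementation: the Gaussian policy, the linear critic over an assumed EMG feature vector, and all step sizes and dimensions are illustrative assumptions.

```python
import numpy as np

# Minimal continuous actor-critic sketch (illustrative, not the paper's code).
# The actor is a Gaussian policy over a continuous joint command; the critic
# is a linear value estimate over a (hypothetical) EMG feature vector.

rng = np.random.default_rng(0)
n_features = 8          # assumed size of the myoelectric feature vector
alpha_actor = 0.01      # actor step size (assumption)
alpha_critic = 0.1      # critic step size (assumption)
gamma = 0.9             # discount factor (assumption)
sigma = 0.2             # fixed exploration noise (assumption)

w_mu = np.zeros(n_features)   # actor mean weights
v = np.zeros(n_features)      # critic value weights

def step(x, human_reward, x_next):
    """One actor-critic update from a sparse human-delivered reward.

    x, x_next : EMG feature vectors at t and t+1
    human_reward : numeric reward (0.0 on steps with no trainer feedback)
    Returns the continuous action that was taken.
    """
    global w_mu, v
    mu = w_mu @ x                          # policy mean
    a = rng.normal(mu, sigma)              # sampled continuous action
    delta = human_reward + gamma * (v @ x_next) - (v @ x)  # TD error
    v += alpha_critic * delta * x          # critic update
    # Gaussian-policy gradient of the log-likelihood w.r.t. the mean weights
    w_mu += alpha_actor * delta * ((a - mu) / sigma**2) * x
    return a
```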

Cited by 109 publications (88 citation statements). References 17 publications. Citing publications span 2013 to 2023.

Citation statements (ordered by relevance):
“…In this learning scenario, feedback can be restricted to express various intensities of approval and disapproval; such feedback is mapped to numeric "reward" that the agent uses to revise its behavior [2], [3], [8], [1], [9]. Compared to learning from demonstration, learning from human reward requires only a simple task-independent interface and may require less expertise and place less cognitive load on the trainer [10].…”
Section: A. Learning From Human Rewards (mentioning)
confidence: 99%
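The mapping the quote describes, from graded approval and disapproval to numeric reward, can be pictured with a small sketch; the event names and magnitudes below are hypothetical, not taken from any of the cited systems.

```python
# Hypothetical mapping from trainer feedback to numeric reward; the key
# names and magnitudes are assumptions made for illustration only.
FEEDBACK_TO_REWARD = {
    "strong_approval":    +2.0,
    "approval":           +1.0,
    "no_feedback":         0.0,   # sparse: most time steps carry no signal
    "disapproval":        -1.0,
    "strong_disapproval": -2.0,
}

def reward_from_feedback(event: str) -> float:
    """Convert a trainer feedback event into the agent's numeric reward."""
    return FEEDBACK_TO_REWARD.get(event, 0.0)
```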
“…The feedback that the human provides during such interaction can take many forms, e.g., reward and punishment [1], [2], [3], advice [4], guidance [5], or critiques [6]. Among these, learning from rewards generated by a human trainer observing the agent in action has proven to be a powerful method for human trainers who are not experts in autonomous agents to teach such agents to perform challenging tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…Reward and punishment are frequently received in a social context, from another social agent. In recent years, this form of communication and its machine-learning analog, reinforcement learning, have been adapted to permit teaching of artificial agents by their human users [4,14,6,13,11,10]. In this form of teaching, which we call interactive shaping, a user observes an agent's behavior while generating human reward instances through varying interfaces (e.g., keyboard, mouse, or verbal feedback); each instance is received by the learning agent as a time-stamped numeric value and used to inform future behavioral choices.…”
Section: Introduction (mentioning)
confidence: 99%
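A rough sketch of the bookkeeping that interactive shaping implies: each human reward instance arrives as a time-stamped numeric value and must be credited to recently executed behavior, since feedback lags the action it targets. The credit window and the uniform split below are assumptions for illustration, not the cited systems' actual credit-assignment rules.

```python
import time
from collections import deque

CREDIT_WINDOW = 0.8   # seconds of history a reward is assumed to apply to

history = deque()     # (timestamp, state, action) tuples

def record(state, action):
    """Log each executed state-action pair; prune entries too old for credit."""
    now = time.time()
    history.append((now, state, action))
    while history and now - history[0][0] > CREDIT_WINDOW:
        history.popleft()

def on_human_reward(value: float):
    """Distribute a time-stamped reward over recently executed actions."""
    now = time.time()
    credited = [(s, a) for (t, s, a) in history if now - t <= CREDIT_WINDOW]
    for s, a in credited:
        update_reward_model(s, a, value / max(len(credited), 1))

def update_reward_model(state, action, target):
    pass  # placeholder for whatever update rule the learner uses
```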
“…Accordingly, other algorithms for learning from human reward [4,21,20,16,18,13] do not directly account for delay, do not model human reward explicitly, and are not fully myopic (i.e., they employ discount factors greater than 0).…”
Section: Background on TAMER (mentioning)
confidence: 99%
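The distinction the quote draws, fully myopic learning versus discounting, comes down to the discount factor in the learning target. A worked contrast, with illustrative numbers:

```python
# A "fully myopic" learner sets gamma = 0, so its target is just the
# immediate human reward; the algorithms the quote criticizes use
# gamma > 0, folding predicted future value into the target.

def td_target(reward: float, value_next: float, gamma: float) -> float:
    """Standard temporal-difference learning target."""
    return reward + gamma * value_next

myopic_target     = td_target(reward=1.0, value_next=5.0, gamma=0.0)  # -> 1.0
discounted_target = td_target(reward=1.0, value_next=5.0, gamma=0.9)  # -> 5.5
```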
“…Though a few past projects have considered this problem of learning from human reward [4,21,20,16,18,13,9], only two of these implemented their solution for a robotic agent. In one such project [13], the agent learned partially in simulation and from hardcoded reward, demonstrations, and human reward.…”
Section: Introduction (mentioning)
confidence: 99%