Robotics: Science and Systems XVI 2020
DOI: 10.15607/rss.2020.xvi.055
Learning from Interventions: Human-robot interaction as both explicit and implicit feedback

Cited by 27 publications (28 citation statements). References 19 publications.
“…The approach presented in this paper focuses on cases where corrections are generalized beyond kinematic variables and are provided through an external controller. Existing methods for corrections using an external controller [11], [12], [13] allow for path transformations and policy learning for UAVs and mobile robots. However, as with the methods in pHRI, the main emphasis is on corrections to kinematics and on updating the trajectory or policy, rather than on using task knowledge to inform possible corrective input.…”
Section: Related Work (mentioning)
confidence: 99%
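The excerpt above contrasts kinematic trajectory corrections with task-level input. The kind of trajectory update it describes can be made concrete with a small sketch: a single corrective displacement from an external controller is propagated to nearby waypoints with a smooth falloff. This is a hypothetical illustration, not the method of the cited works; the function name and the Gaussian weighting are assumptions.

```python
import numpy as np

def deform_trajectory(traj, t_corr, delta, width=5.0):
    """Shift waypoints near index t_corr by correction delta, with
    influence decaying as a Gaussian over waypoint index.
    (Illustrative sketch; not any cited paper's method.)"""
    traj = np.asarray(traj, dtype=float).copy()
    idx = np.arange(len(traj))
    weights = np.exp(-0.5 * ((idx - t_corr) / width) ** 2)
    return traj + weights[:, None] * np.asarray(delta, dtype=float)

# Example: nudge the middle of a straight 2-D path upward by 0.5.
path = np.stack([np.linspace(0.0, 10.0, 21), np.zeros(21)], axis=1)
corrected = deform_trajectory(path, t_corr=10, delta=[0.0, 0.5])
```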
“…There has also been significant recent interest in active preference queries for learning reward functions from pairwise preferences over demonstrations [7,10,13,19,32,39]. However, many forms of human advice can be unintuitive, since the learner may visit states far from those the human supervisor would visit, making it difficult for humans to judge what correct behavior looks like without interacting with the environment themselves [36,42].…”
Section: Background and Related Work (mentioning)
confidence: 99%
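Since the excerpt above refers to learning reward functions from pairwise preferences over demonstrations, a minimal sketch may help make the mechanics concrete. The sketch below assumes a Bradley-Terry preference model over a linear reward on trajectory features, a common choice in this literature; the names and the gradient update are illustrative, not drawn from the cited papers.

```python
import numpy as np

def pref_prob(w, feats_a, feats_b):
    """P(trajectory A preferred over B) under a Bradley-Terry model
    with linear reward r = w . phi (assumed model, for illustration)."""
    return 1.0 / (1.0 + np.exp(-(feats_a @ w - feats_b @ w)))

def update(w, feats_a, feats_b, prefer_a, lr=0.1):
    """One gradient-ascent step on the preference log-likelihood."""
    target = 1.0 if prefer_a else 0.0
    return w + lr * (target - pref_prob(w, feats_a, feats_b)) * (feats_a - feats_b)

# Example: two trajectories summarized by 3-D feature vectors.
w = np.zeros(3)
phi_a, phi_b = np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])
for _ in range(100):
    w = update(w, phi_a, phi_b, prefer_a=True)
```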
“…Kurenkov et al. [26] and Xie et al. [48] leverage interventions from suboptimal supervisors to accelerate policy learning, but assume that the supervisors are algorithmic and can therefore be queried cheaply. Amir et al. [2], Kahn et al. [23], Kelly et al. [24], Spencer et al. [42], and Wang et al. [47] instead consider learning from human supervisors and present learning algorithms that use the timing and nature of human interventions to update the learned policy. By giving the human control for multiple timesteps in a row, these algorithms improve over methods that only hand over control on a state-by-state basis [6].…”
Section: Background and Related Work (mentioning)
confidence: 99%
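The excerpt above describes algorithms in which the human keeps control for contiguous blocks of timesteps and those intervention segments supervise the policy. Below is a minimal sketch of such a data-collection loop; env, robot_policy, and human are hypothetical stand-ins with simplified interfaces, and the loop illustrates the generic intervention-gated pattern rather than any one cited algorithm.

```python
def rollout_with_interventions(env, robot_policy, human, dataset, horizon=200):
    """Run one episode; the robot acts until the human intervenes, and every
    human-controlled step is stored as a supervised training example."""
    obs = env.reset()
    for _ in range(horizon):
        if human.wants_control(obs):
            # Human holds control for a contiguous block of timesteps.
            action = human.act(obs)
            dataset.append((obs, action))  # intervention steps label the data
        else:
            action = robot_policy.act(obs)
        obs, done = env.step(action)  # simplified step interface (assumption)
        if done:
            break
    return dataset
```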
“…However, there is an inherent cost in time and effort to querying the user for input, so the user-in-the-loop learning process can be prohibitively frustrating [17], [18]. Users tend to prefer online learning approaches (e.g., [14]) that do not require post-hoc corrections to the learned robot policies. Therefore, in this work, we improve task learning at run time by soliciting different types of demonstrations from the human teacher.…”
Section: Related Work (mentioning)
confidence: 99%