Robotics: Science and Systems XI 2015
DOI: 10.15607/rss.2015.xi.032

Shared Autonomy via Hindsight Optimization

Abstract: In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal, and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the…
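The abstract's formulation — a POMDP whose only uncertainty is over the user's goal, solved approximately via hindsight optimization — can be illustrated with a minimal sketch. This is not the authors' implementation; the goal set, belief, and cost-to-go function below are hypothetical stand-ins. Hindsight optimization (QMDP-style) assumes the goal uncertainty resolves after the current step, so the robot simply picks the action with the lowest expected cost-to-go under its current belief over goals.

```python
# Minimal sketch (hypothetical, not the paper's code): hindsight-optimization
# action selection under a belief over the user's goal.

def hindsight_action(actions, goals, belief, q_value):
    """Pick the action with lowest expected cost-to-go under the goal belief.

    belief[g]      -- probability that the user's intended goal is g
    q_value(a, g)  -- cost-to-go of taking action a if g were the true goal
    """
    def expected_cost(a):
        return sum(belief[g] * q_value(a, g) for g in goals)
    return min(actions, key=expected_cost)

# Toy example: two candidate goals on a line, robot at the origin,
# actions move one unit left or right; distance-to-goal stands in for cost.
goals = [-3.0, 3.0]
belief = {-3.0: 0.2, 3.0: 0.8}
q = lambda a, g: abs(a - g)
print(hindsight_action([-1.0, 1.0], goals, belief, q))  # 1.0: moves toward the likelier goal
```

In the paper's full system, the belief over goals is updated from user input via maximum entropy inverse optimal control rather than fixed as here.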

Cited by 135 publications (146 citation statements)
References 22 publications
“…IRL assumes access to high-quality demonstrations of the task. However, this is rarely available in robotics, where it is difficult to control high degree-of-freedom (DOF) robots [15,27,29]. Preference-based learning methods, on the other hand, are very inefficient since they attempt to learn a continuous reward function from binary feedback.…”
Section: Introduction
confidence: 99%
“…In such interaction paradigms, the robot aims to infer a cost function or policy that best describes the examples that it has received. New avenues of research focus on learning such robot objectives from human input through demonstrations [9], [10], teleoperation data [11], corrections [12], [13], comparisons [14], examples of what constitutes a goal [15], or even specified proxy objectives [16]. In this paper, we focus on learning from two such types of human input — demonstrations and physical corrections — although we stress that the principles outlined in our formalism are more general and could be applied to the other interaction modes mentioned.…”
Section: A. Robots Learning From Humans
confidence: 99%
“…To approximate the intractable integral in (12), we sampled a set X of 1500 trajectories. We sampled costs according to (11), given by random unit-norm θs, then optimized them with an off-the-shelf trajectory optimizer. We used TrajOpt [41], which is based on sequential quadratic programming and uses convex-convex collision checking.…”
Section: B. Approximation
confidence: 99%
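The sampling step quoted above — drawing random unit-norm weight vectors θ, each defining a linear cost over trajectory features — can be sketched as follows. This is a hedged illustration, not the citing paper's code: the feature dimension is a made-up placeholder, and the subsequent TrajOpt optimization of each sampled cost is omitted. Normalizing Gaussian draws yields vectors distributed uniformly on the unit sphere.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_unit_thetas(n_samples, dim):
    """Sample weight vectors uniformly from the unit sphere in R^dim.

    Each row is a unit-norm theta defining a linear trajectory cost
    theta . phi(xi); the optimizer (e.g. TrajOpt) would then be run
    once per sampled cost to produce a candidate trajectory.
    """
    thetas = rng.normal(size=(n_samples, dim))        # isotropic Gaussian draws
    return thetas / np.linalg.norm(thetas, axis=1, keepdims=True)

# 1500 sampled costs, matching the excerpt; dim=5 is a hypothetical feature count.
thetas = sample_unit_thetas(1500, 5)
```

Uniformity on the sphere follows from the rotational invariance of the isotropic Gaussian, which is why normalized Gaussian draws are preferred over, say, normalizing uniform-cube samples (which would bias toward the cube's corners).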