2016
DOI: 10.1007/978-3-319-28872-7_19

Beyond Geometric Path Planning: Learning Context-Driven Trajectory Preferences via Sub-optimal Feedback

Abstract: We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than those arising from simple geometric constraints on the robot's trajectory, such as the distance of the robot from a human. Our preferences are instead governed by the surrounding context of various objects and human interactions in the environment. Such preferences make the problem challenging because the criterion of defining a g…

Cited by 5 publications (9 citation statements)
References 40 publications

“…Such demonstrations can be extremely challenging and nonintuitive to provide for many high DoF manipulators [2]. Instead, we found [21,22] that it is more intuitive for users to give incremental feedback on high DoF arms by improving upon a proposed trajectory. We now summarize three feedback mechanisms that enable the user to iteratively provide improved trajectories.…”
Section: B Feedback Mechanisms
mentioning
confidence: 99%
“…Parts of this work has been published at NIPS and ISRR conferences [22,21]. This journal submission presents a consistent full paper, and also includes the proof of regret bounds, more details of the robotic system, and a thorough related work.…”
mentioning
confidence: 99%
“…We differ from these in that we learn the cost function capturing preferences arising during human-object interactions. Jain et al [32], [7] learned a context-rich cost via iterative feedback from nonexpert users. Similarly, we also learn from the preference data of non-expert users.…”
Section: Related Work
mentioning
confidence: 99%
“…The user feedback in PlanIt is in contrast to other learning-based approaches such as learning from the expert's demonstrations (LfD) [14], [15], [16], [39] or the co-active feedback [32], [7]. In both LfD and co-active learning approaches it is time consuming and expensive to collect the preference data on a robotic platform and across many environments.…”
Section: Planit: a Crowdsourcing Engine
mentioning
confidence: 99%