Robotics: Science and Systems XI 2015
DOI: 10.15607/rss.2015.xi.018

Grounding English Commands to Reward Functions

Abstract: As intelligent robots become more prevalent, methods that make interaction with them more accessible are increasingly important. Communicating the tasks a person wants the robot to carry out via natural language, and training the robot to ground that language through demonstration, are especially appealing approaches for interaction, since they do not require a technical background. However, existing approaches map natural language commands to robot command languages that directly expr…

Cited by 49 publications (66 citation statements). References 19 publications.
“…The question of how to effectively convert between natural language instructions and robot behavior has been widely studied in previous work [50,34,24,14,9,47,8,18,11,1,33,28,36,27,40,7,2,19,37]. So far, there have been three categories of behavior specifications that these works have mapped natural language to: action sequences, goal states, and LTL specifications.…”
Section: Related Work
confidence: 99%
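The three target representations named in that excerpt (action sequences, goal states, and LTL specifications) can be made concrete with a small sketch. The class and field names below are illustrative assumptions, not taken from any of the cited works.

```python
# Illustrative sketch of the three behavior-specification targets that prior
# language-grounding work has mapped commands to; all names are hypothetical.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ActionSequence:
    """A fixed plan executed step by step."""
    actions: List[str]            # e.g. ["go_north", "pick_up(block1)"]

@dataclass
class GoalState:
    """A desired world state; a planner searches for actions that reach it."""
    predicates: Dict[str, bool]   # e.g. {"blockIn(block1, room2)": True}

@dataclass
class LTLSpecification:
    """A linear temporal logic formula constraining whole trajectories."""
    formula: str                  # e.g. "F (in_room2 & F holding_block)"
```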
“…Following the framework introduced by MacGlashan et al [33], we treat natural language as the specification of a latent reward function that completes the definition of an otherwise fully-specified MDP. We use a language grounding model to arrive at a more consolidated, semantic representation of that reward function, thereby completing the MDP and allowing it to be passed to an arbitrary planning algorithm for generating robot behavior.…”
Section: Problem Setting
confidence: 99%
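The excerpt above summarizes the framing borrowed from MacGlashan et al.: language specifies a latent reward function, grounding recovers it, and the completed MDP can then be handed to any planner. The sketch below is a hypothetical illustration of that pipeline, not their implementation; the names (`MDP`, `ground_command`, `plan`) and the toy grounding rule are assumptions.

```python
# Sketch: a command is grounded to a reward function, which completes an
# otherwise fully specified MDP so an arbitrary planner can produce behavior.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = str
Action = str
RewardFn = Callable[[State, Action, State], float]

@dataclass
class MDP:
    states: List[State]
    actions: List[Action]
    transition: Callable[[State, Action], Dict[State, float]]  # (s, a) -> P(s')
    reward: Optional[RewardFn] = None   # left unspecified until language is grounded
    gamma: float = 0.95

def ground_command(command: str) -> RewardFn:
    """Stand-in for a learned grounding model mapping a command to a reward."""
    target = command.split()[-1]        # toy rule: last word names the goal state
    return lambda s, a, s_next: 1.0 if s_next == target else 0.0

def plan(mdp: MDP, start: State) -> List[Action]:
    """Placeholder for an arbitrary planner (value iteration, search, ...)."""
    raise NotImplementedError

# Usage: the reward slot is filled by grounding, then the completed MDP is planned over.
# mdp.reward = ground_command("go to the kitchen"); plan(mdp, "hallway")
```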
“…Here, we learn an adaptive strategy that aims at maximising the overall learning performance simultaneously, by properly adjusting the positive confidence threshold in the range of 0.65 to 0.95. We train the optimization using an RL library, BURLAP (MacGlashan, 2015), as follows, in detail:…”
Section: Adaptive Confidence Threshold
confidence: 99%
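The excerpt gives only the high-level idea; below is a minimal, hedged sketch of one way an RL loop could adapt such a confidence threshold. It uses plain-Python Q-learning rather than BURLAP (a Java library), and `run_episode` is a stub standing in for the downstream learner's measured performance; the state space is the discretized 0.65–0.95 threshold range mentioned in the quote.

```python
# Hedged sketch: treat the positive-confidence threshold as something a
# Q-learning agent tunes by nudging it up or down and observing performance.
import random
from collections import defaultdict

THRESHOLDS = [round(0.65 + 0.05 * i, 2) for i in range(7)]   # 0.65 .. 0.95
ACTIONS = [-1, 0, +1]                                         # lower / keep / raise

def run_episode(threshold: float) -> float:
    """Stub reward: would run the downstream learner with this threshold
    and return its measured performance."""
    return random.random()

Q = defaultdict(float)                                        # (state, action) -> value
alpha, gamma, epsilon = 0.1, 0.9, 0.2
state = THRESHOLDS.index(0.80)                                # start mid-range

for episode in range(200):
    # epsilon-greedy choice over threshold adjustments
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    next_state = min(max(state + action, 0), len(THRESHOLDS) - 1)
    reward = run_episode(THRESHOLDS[next_state])
    # standard Q-learning update
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state
```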
“…There has been a broad and diverse set of work examining how best to interpret and execute natural language instructions on a robot platform (Vogel and Jurafsky, 2010; Tellex et al., 2011; Artzi and Zettlemoyer, 2013; Howard et al., 2014; Andreas and Klein, 2015; MacGlashan et al., 2015; Paul et al., 2016; Mei et al., 2016; Arumugam et al., 2017). Vogel and Jurafsky (2010) produce policies using rewards based on language and expert trajectories, which allow for planning within a stochastic environment along with re-planning in case of failure.…”
Section: Related Work
confidence: 99%