2018
DOI: 10.48550/arxiv.1811.03516
Preprint

Learning from Demonstration in the Wild

Cited by 3 publications (5 citation statements, published 2019); references 0 publications.
“…in online video repositories such as YouTube), solving problems in existing textbooks, or solving existing machine learning benchmarks in language, logic, reinforcement learning, etc. There is a long history of fruitful research in imitation learning and learning via observation that demonstrates the benefits of exploiting such data [37,13,162,7,142,36,182,129,116,1]. AI-GAs too could benefit from this treasure trove of information.…”
Section: Discussion (mentioning)
confidence: 99%
“…Previously, Ziebart et al. [6] and Ross et al. [5] proposed general methods in Inverse Reinforcement Learning and Interactive Learning from Demonstration, with an empirical study on a driving game. More recently, Kuefler et al. [16] and Behbahani et al. [17] learn an end-to-end policy in a GAIL-like [13] manner. Codevilla et al. [18] and Liang et al. [19] share a hierarchical perspective similar to ours, but their control policies are still entirely neural.…”
Section: Related Work: Classical Autonomous Driving System (mentioning)
confidence: 99%
“…Imitation learning is also known as learning from demonstrations or apprenticeship learning; its goal is to learn how to perform a task directly from expert demonstrations, without any access to the reward signal r(s, a). The main recent lines of research within imitation learning are behavioural cloning (BC) [6,39], which performs supervised learning from observations to actions given a number of expert demonstrations; inverse reinforcement learning (IRL) [1], where a reward function is estimated that explains the demonstrations as (near-)optimal behavior; and generative adversarial imitation learning (GAIL) [3,4,17,43], which is inspired by generative adversarial networks (GANs) [15]. Let T_E denote the trajectories generated by the expert policy π_E, each of which consists of a sequence of state-action pairs.…”
Section: Generative Adversarial Imitation Learning (mentioning)
confidence: 99%
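
The behavioural cloning objective described in this statement reduces imitation to supervised regression from states to expert actions. Below is a minimal sketch of one BC update step, assuming continuous actions and an MSE loss; the dimensions STATE_DIM and ACTION_DIM, the network architecture, and the random batch standing in for a slice of the expert trajectories T_E are illustrative assumptions, not taken from the cited work.

```python
# Minimal behavioural-cloning (BC) sketch: regress the policy's actions
# onto the expert's actions with supervised learning.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2  # hypothetical dimensions, chosen for illustration

# Policy network mapping observations to continuous actions.
policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64),
    nn.Tanh(),
    nn.Linear(64, ACTION_DIM),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def bc_update(states: torch.Tensor, expert_actions: torch.Tensor) -> float:
    """One supervised step on a batch of expert (state, action) pairs."""
    loss = nn.functional.mse_loss(policy(states), expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with a random batch standing in for a slice of T_E.
states = torch.randn(32, STATE_DIM)
expert_actions = torch.randn(32, ACTION_DIM)
print(bc_update(states, expert_actions))
```

IRL and GAIL replace this direct regression with a learned reward or discriminator, respectively, but the BC step above is the baseline the quoted passage contrasts them against.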
“…Then, we reuse the same pre-trained animal models as the default policies for the wounded animals in all experiments. The learning curves of IGASIL versus MADDPG and DDPG are plotted in Figure 3 under five random seeds (1, 2, 3, 4, 5). To present a smoother learning curve, the reward value is averaged every 1000 episodes.…”
Section: Cooperative Endangered Wildlife Rescue (mentioning)
confidence: 99%
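
The per-1000-episode averaging mentioned in this statement can be reproduced with a simple non-overlapping block mean. This is a sketch under the assumption that raw per-episode returns are available as a 1-D NumPy array; the function name block_average and the sample data are hypothetical.

```python
# Smooth a learning curve by averaging returns over fixed-size episode blocks.
import numpy as np

def block_average(episode_rewards: np.ndarray, block: int = 1000) -> np.ndarray:
    """Average rewards in non-overlapping blocks of `block` episodes."""
    n = len(episode_rewards) // block * block  # drop the trailing partial block
    return episode_rewards[:n].reshape(-1, block).mean(axis=1)

smoothed = block_average(np.random.randn(10_000))
print(smoothed.shape)  # (10,) — one point per 1000 episodes
```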