Learning to Drive a Real Car in 20 Minutes

Riedmiller, Martin; Montemerlo, Michael; Dahlkamp, Hendrik

doi:10.1109/fbit.2007.37

Cited by 79 publications

(42 citation statements)

References 6 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…a larger number of transition tuples, the computational update to the value function is performed. Various batch RL algorithms have been proposed in the literature [9]- [11] and batch RL has recently been successfully applied to various challenging real-world applications [12], [13]. Figure 1 sketches the general batch reinforcement learning framework.…”

Section: Batch Reinforcement Learningmentioning

confidence: 99%

Improved neural fitted Q iteration applied to a novel computer gaming and learning benchmark

Gabel

Lutz

Riedmiller

2011

2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Self Cite

View full text Add to dashboard Cite

Abstract-Neural batch reinforcement learning (RL) algorithms have recently shown to be a powerful tool for model-free reinforcement learning problems. In this paper, we present a novel learning benchmark from the realm of computer games and apply a variant of a neural batch RL algorithm in the scope of this benchmark. Defining the learning problem and appropriately adjusting all relevant parameters is often a tedious task for the researcher who implements and investigates some learning approach. In RL, the suitable choice of the function c of immediate costs is crucial, and, when utilizing multi-layer perceptron neural networks for the purpose of value function approximation, the definition of c must be well aligned with the specific characteristics of this type of function approximator. Determining this alignment is especially tricky, when no a priori knowledge about the task and, hence, about optimal policies is available. To this end, we propose a simple, but effective dynamic scaling heuristic that can be seamlessly integrated into contemporary neural batch RL algorithms. We evaluate the effectiveness of this heuristic in the context of the well-known pole swing-up benchmark as well as in the context of the novel gaming benchmark we are suggesting.

show abstract

Section: Batch Reinforcement Learningmentioning

confidence: 99%

Improved neural fitted Q iteration applied to a novel computer gaming and learning benchmark

Gabel

Lutz

Riedmiller

2011

2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Self Cite

View full text Add to dashboard Cite

show abstract

“…Multilayer perceptrons (MLPs) [62] are known to be a very useful and robust regression method to approximate Q-functions in a broad range of different applications [63] [64]. However, some peculiarities have to be considered in order to make them work properly in practical applications.…”

Section: Batch-mode Reinforcement Learningmentioning

confidence: 99%

Cognitive concepts in autonomous soccer playing robots

Lauer

Hafner

Lange

et al. 2010

Cognitive Systems Research

Self Cite

View full text Add to dashboard Cite

Computational concepts of cognition, their implementation in complex autonomous systems, and their empirical evaluation are key techniques to understand and validate concepts of cognition and intelligence. In this paper we want to describe computational concepts of cognition that were successfully implemented in the domain of soccer playing robots and show the interactions between cognitive concepts, software engineering and real time application development. Beside a description of the general concepts we will focus on aspects of perception, behavior architecture, and reinforcement learning.

show abstract

“…In the scope of this work, we disregard the technical challenges of car control such as reliable vision and scene interpretation, track control, or the integration of sensing and acting (see [2] for a corresponding learning approach) and instead adopt a rather abstract, mu lti-agent point of view. While our focus is still on indiv idual autonomous agents residing in the vehicles, we focus on these agents' goals of implementing a suitable high-level car control requiring the interaction with other traffic participants.…”

Section: Introductionmentioning

confidence: 99%

The Cooperative Driver: Multi-Agent Learning for Preventing Traffic Jams

Gabel¹,

Riedmiller²

2013

IJTTE

Self Cite

View full text Add to dashboard Cite

The optimizat ion of traffic flow on roads and highways of modern industrialized countries is key to their economic gro wth and success. Besides, the reduction of traffic congestions and jams is also desirable fro m an ecological point of view as it yields a contribution to climate protection. In this art icle, we stick to a microscopic traffic simu lation model and interpret the task of traffic flow optimizat ion as a mult i-agent learn ing problem. In so doing, we attach simple, adaptive agents to each of the vehicles and make them learn, using a distributed variant of model-free reinforcement learning, a cooperative driving behavior that is jointly optimal and aims at the prevention of traffic jams. Ou r approach is evaluated in a series of simulation experiments that emphasize that the substitution of selfish human behavior in traffic by the learned driving policies of the agents can result in substantial improvements in the quality of traffic flow.

show abstract

Learning to Drive a Real Car in 20 Minutes

Cited by 79 publications

References 6 publications

Improved neural fitted Q iteration applied to a novel computer gaming and learning benchmark

Improved neural fitted Q iteration applied to a novel computer gaming and learning benchmark

Cognitive concepts in autonomous soccer playing robots

The Cooperative Driver: Multi-Agent Learning for Preventing Traffic Jams

Contact Info

Product

Resources

About