Figure 1. Driving scenarios from our new benchmark where the agent needs to react to dynamic changes in the environment, handle clutter (only part of the environment is causally relevant), and predict complex sensorimotor controls (lateral and longitudinal). We show that Behavior Cloning yields state-of-the-art policies in these complex scenarios and investigate its limitations.
Abstract
Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, including in unseen environments, executing complex lateral and longitudinal maneuvers without these reactions being explicitly programmed. However, we confirm well-known limitations (due to dataset bias and overfitting) and identify new generalization issues (due to dynamic objects and the lack of a causal model) as well as training instability, all of which require further research before behavior cloning can graduate to real-world driving. We will release our benchmark and code.