2021
DOI: 10.48550/arxiv.2111.09794
Preprint

A Survey of Zero-shot Generalisation in Deep Reinforcement Learning

Abstract: The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real-world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different…


Cited by 30 publications (47 citation statements)
References 100 publications
“…For instance, gradually training AVs with increasing risk levels under a curriculum learning [194] framework may help systems generalize more easily to more types of safety-critical scenarios. One recent survey [195] that investigates the generalization problem in RL emphasizes the importance of environment generation in increasing the similarity between training and testing domains. This direction extends scenario generation from safety to broader views that require goal-conditioned environment generation.…”
Section: What Are Future Directions
confidence: 99%
“…In order to distinguish different environments, we follow the ideas in Kirk et al. (2021) and consider a Contextual Partially Observable Markov Game, where we introduce a set of contexts K. For each context k ∈ K we have a Partially Observable Markov Game (POMG) with the property that the state of the game can be decomposed into two parts, s = (k, s′) ∈ S_K, where s′ ∈ S is the state and k ∈ K is the context. Formally: Definition 2.2 (Contextual Partially Observable Markov Game (CPOMG)).…”
Section: A Is the Joint Action Space
confidence: 99%
“…The learners then engage in a "meta-learning" problem with awareness of the contexts. This is the background of the framework introduced in Kirk et al. (2021). Similarly, we will consider a discrete set of contexts in our analysis here.…”
Section: A Is the Joint Action Space
confidence: 99%