Accelerated Quality-Diversity for Robotics through Massive Parallelism

Lim, Bryan; Allard, Maxime; Grillotti, Luca; Cully, Antoine

doi:10.48550/arxiv.2202.01258

Cited by 4 publications

(13 citation statements)

References 40 publications


“…Since CMA-MAE is a general purpose algorithm, we are excited about future work that will test the CMA-MAE variants in domains beyond robot locomotion, such as robotic manipulation [25] and scenario generation [26]. Future work will also test the pretrained controllers in the real world and will explore the computational benefits of recent hardware-accelerated frameworks [56].…”

Section: Discussion (mentioning)

confidence: 99%

Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing

Tjanaka, Fontaine, Aniruddha et al. 2022

Preprint


Pre-training a diverse set of robot controllers in simulation has enabled robots to adapt online to damage in robot locomotion tasks. However, finding diverse, high-performing controllers requires specialized hardware and extensive tuning of a large number of hyperparameters. On the other hand, the Covariance Matrix Adaptation MAP-Annealing (CMA-MAE) algorithm, an evolution strategies (ES)-based quality diversity algorithm, does not have these limitations and has been shown to achieve state-of-the-art performance in standard benchmark domains. However, CMA-MAE cannot scale to modern neural network controllers due to its quadratic complexity. We leverage efficient approximation methods in ES to propose three new CMA-MAE variants that scale to very high dimensions. Our experiments show that the variants outperform ES-based baselines in benchmark robotic locomotion tasks, while being comparable with state-of-the-art deep reinforcement learning-based quality diversity algorithms. Source code and videos are available at https://scalingcmamae.github.io.
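To make the quadratic-complexity point in the abstract concrete, the sketch below counts the entries of a full covariance matrix for a controller with n parameters and compares it with a diagonal covariance. The function names, the 4-byte entry size, and the example parameter counts are illustrative assumptions, and the diagonal form is shown only as one well-known ES approximation; the abstract does not say which approximations the three variants use.

```python
def full_covariance_entries(n_params: int) -> int:
    """Number of entries in a full n x n covariance matrix (grows as n^2)."""
    return n_params * n_params


def diagonal_covariance_entries(n_params: int) -> int:
    """Number of entries when only the diagonal is kept (grows as n)."""
    return n_params


def megabytes(n_entries: int, bytes_per_entry: int = 4) -> float:
    """Storage in MB, assuming 4-byte (float32) entries by default."""
    return n_entries * bytes_per_entry / 1e6


if __name__ == "__main__":
    # Hypothetical controller sizes, chosen only to show the scaling.
    for n in (100, 10_000, 100_000):
        full = megabytes(full_covariance_entries(n))
        diag = megabytes(diagonal_covariance_entries(n))
        print(f"n={n:>7}: full={full:>12.2f} MB, diagonal={diag:>8.4f} MB")
```

Under these assumptions, a 100,000-parameter controller would need about 40 GB for the full matrix but well under 1 MB for the diagonal, which is the kind of gap that motivates approximate covariance models.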


“…Recent advances in hardware acceleration have led to new QD libraries such as QDax [20] or EvoJax [21]. These tools rely on highly-parallelised simulators like Brax [22] that can run on accelerators (e.g., GPUs and TPUs) and thus target simulated domains, for example, robotics control, where they drastically reduce the evaluation time.…”

Section: B Hardware-accelerated Quality-diversity (mentioning)

confidence: 99%

“…In addition, they have given us access to 10 or 100 times more evaluations per generation within the same amount of time. Lim et al [20] prove that the performance of MAP-Elites is robust to large increases in batch-size values (i.e. large increases in the number of solutions generated per generation).…”

Section: B Hardware-accelerated Quality-diversity (mentioning)

confidence: 99%

“…Lim et al [20] studied the scalability of MAP-Elites to large batch-size (i.e. number of offspring generated per generation).…”

Section: Sampling-size: Alternative For Comparability (mentioning)

confidence: 99%
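The batch size discussed in the statements above is the number of offspring generated and evaluated per generation. A minimal MAP-Elites sketch with an explicit `batch_size` parameter is given below, assuming a toy fitness and descriptor function, a uniform grid archive, and Gaussian mutation; none of these choices are taken from the cited papers. The batch is evaluated in a plain Python loop here, which is the step a hardware-accelerated simulator would run in parallel.

```python
import random


def evaluate(x):
    """Toy task (an assumption): fitness peaks at 0.5 in every dimension;
    the descriptor is the first two coordinates of the solution."""
    fitness = -sum((xi - 0.5) ** 2 for xi in x)
    descriptor = (x[0], x[1])
    return fitness, descriptor


def cell_index(descriptor, bins):
    """Map a descriptor in [0, 1]^2 to a cell of a bins x bins grid."""
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)


def map_elites(batch_size, generations, bins=10, dim=4, sigma=0.1, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, solution)
    for _ in range(generations):
        if archive:
            # Select batch_size parents uniformly from the archive and mutate them.
            elites = list(archive.values())
            parents = [rng.choice(elites)[1] for _ in range(batch_size)]
            batch = [
                [min(1.0, max(0.0, xi + rng.gauss(0.0, sigma))) for xi in parent]
                for parent in parents
            ]
        else:
            # First generation: random initial solutions.
            batch = [[rng.random() for _ in range(dim)] for _ in range(batch_size)]
        # Evaluate the whole batch (sequential here, parallel on an accelerator).
        results = [evaluate(x) for x in batch]
        # Insert each offspring if its cell is empty or it beats the current elite.
        for x, (fit, desc) in zip(batch, results):
            cell = cell_index(desc, bins)
            if cell not in archive or fit > archive[cell][0]:
                archive[cell] = (fit, x)
    return archive


if __name__ == "__main__":
    for batch_size in (16, 256):
        archive = map_elites(batch_size=batch_size, generations=20)
        print(batch_size, len(archive))
```

With the number of generations fixed, a larger batch gives more evaluations in total, so comparisons across batch sizes usually fix the evaluation budget instead.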

“…Nonetheless, recent advances in computer systems enable the high-parallelisation of evaluations. Recent libraries such as QDax [20], or EvoJax [21] based on the Brax simulator [22] allowed to speedup computation by a large order of magnitude thanks to the high-parallelisation of evaluations. With such tools, we now have access to 10 or 100 times more evaluations per generation within the same amount of time.…”

Section: Introduction (mentioning)

confidence: 99%


Empirical analysis of PGA-MAP-Elites for Neuroevolution in Uncertain Domains

Flageat, Chalumeau, Cully 2023

ACM Trans. Evol. Learn. Optim.


Quality-Diversity algorithms, among which MAP-Elites, have emerged as powerful alternatives to performance-only optimisation approaches as they enable generating collections of diverse and high-performing solutions to an optimisation problem. However, they are often limited to low-dimensional search spaces and deterministic environments. The recently introduced Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites) algorithm overcomes this limitation by pairing the traditional Genetic operator of MAP-Elites with a gradient-based operator inspired by Deep Reinforcement Learning. This new operator guides mutations toward high-performing solutions using policy-gradients. In this work, we propose an in-depth study of PGA-MAP-Elites. We demonstrate the benefits of policy-gradients on the performance of the algorithm and the reproducibility of the generated solutions when considering uncertain domains. We first prove that PGA-MAP-Elites is highly performant in both deterministic and uncertain high-dimensional environments, decorrelating the two challenges it tackles. Secondly, we show that in addition to outperforming all the considered baselines, the collections of solutions generated by PGA-MAP-Elites are highly reproducible in uncertain environments, approaching the reproducibility of solutions found by Quality-Diversity approaches built specifically for uncertain applications. Finally, we propose an ablation and in-depth analysis of the dynamic of the policy-gradients-based variation. We demonstrate that the policy-gradient variation operator is determinant to guarantee the performance of PGA-MAP-Elites but is only essential during the early stage of the process, where it finds high-performing regions of the search space.


Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces

Raphaël, Richard, Donà et al. 2023

Proceedings of the Genetic and Evolutionary Computation Conference


Figure 1: MAP-Elites with Gradients Informed Discrete Emitter (me-gide). At each iteration, a discrete solution (here a sequence of letters from a finite vocabulary) is sampled in the repertoire. Gradients are computed over continuous fitness and descriptor functions with respect to their discrete inputs. Gradients are linearly combined to favour higher fitness and exploration of the descriptor space. Probabilities of mutation over the neighbours of the element are derived from this gradient information. Finally, a mutant is sampled according to those probabilities and inserted back in the repertoire.
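The mutation step described in the caption can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes the gradients are available as per-position, per-token tables (as they would be for a one-hot encoded input), that neighbours are single-token substitutions, that the effect of a substitution is estimated to first order, and that probabilities come from a softmax with a temperature. All function names and the toy gradient values are hypothetical.

```python
import math
import random


def combine(grad_f, grad_d, w_f, w_d):
    """Linear combination of fitness and descriptor gradients (same table shape)."""
    return [
        [w_f * gf + w_d * gd for gf, gd in zip(row_f, row_d)]
        for row_f, row_d in zip(grad_f, grad_d)
    ]


def neighbour_scores(seq, grad):
    """First-order estimate of the objective change for each single-token
    substitution: replacing token a by b at position i scores grad[i][b] - grad[i][a]."""
    scores = {}
    for i, a in enumerate(seq):
        for b in range(len(grad[i])):
            if b != a:
                scores[(i, b)] = grad[i][b] - grad[i][a]
    return scores


def mutation_probabilities(scores, temperature=1.0):
    """Softmax over neighbour scores (the softmax itself is an assumption)."""
    top = max(scores.values())
    exps = {move: math.exp((s - top) / temperature) for move, s in scores.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}


def sample_mutant(seq, probs, rng):
    """Sample one substitution according to probs and apply it to a copy of seq."""
    moves = list(probs)
    i, b = rng.choices(moves, weights=[probs[m] for m in moves], k=1)[0]
    mutant = list(seq)
    mutant[i] = b
    return mutant


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = "ABCD"
    seq = [0, 1, 2]  # token indices for "ABC"
    # Toy gradient tables (positions x vocabulary), made up for the example.
    grad_f = [[rng.uniform(-1, 1) for _ in vocab] for _ in seq]
    grad_d = [[rng.uniform(-1, 1) for _ in vocab] for _ in seq]
    grad = combine(grad_f, grad_d, w_f=1.0, w_d=0.5)
    probs = mutation_probabilities(neighbour_scores(seq, grad))
    mutant = sample_mutant(seq, probs, rng)
    print("".join(vocab[t] for t in mutant))
```

In a full loop the mutant would then be evaluated and inserted back in the repertoire under the usual MAP-Elites replacement rule.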

