“…We find this especially promising for meta-learning, potentially building on LEO (Rusu et al., 2018). Inspired by DCEM, other more powerful sampling-based optimizers could be made differentiable in the same way, potentially optimizers that leverage gradient-based information in the inner optimization steps (Sekhon & Mebane, 1998; Theodorou et al., 2010; Stulp & Sigaud, 2012; Maheswaranathan et al., 2018) or by also learning the hyper-parameters of structured optimizers (Li & Malik, 2016; Volpp et al., 2019; Chen et al., 2017).…”