“…This has a very practical benefit, as it limits the length of the applicable interaction history. Current work also typically assumes a stationary task-state distribution for meta-learning (Doshi-Velez and Konidaris, 2013; Wang et al., 2016; Zintgraf et al., 2018; Rakelly et al., 2019; Zintgraf et al., 2019; Humplik et al., 2019; Fakoor et al., 2019; Perez et al., 2020). However, this framework has also been readily applied to more challenging multi-agent learning settings (Da Silva et al., 2006; Amato et al., 2013; Marinescu et al., 2017; Vezhnevets et al., 2019).…”