“…Ensemble Q-functions: Ensembles of Q-functions have been used in RL to account for model uncertainty (Faußer & Schwenker, 2015; Osband et al., 2016; Anschel et al., 2017; Agarwal et al., 2020; Lee et al., 2021; Lan et al., 2020; Chen et al., 2021b). Ensemble transition models: Ensembles of transition (and reward) models have been introduced to model-based RL (e.g., Chua et al., 2018; Kurutach et al., 2018; Janner et al., 2019; Lee et al., 2020; Hiraoka et al., 2020; Abraham et al., 2020). The methods proposed in these studies use a large ensemble of Q-functions or transition models and are therefore computationally intensive.…”