“…Other works considered ℓ_p extensions, high-dimensional variants, or improvements and applications of PSCO. Several works have studied the private multi-armed bandit problem (Mishra & Thakurta, 2015; Tossou & Dimitrakakis, 2017; Sajed & Sheffet, 2019; Ren et al., 2020a; Chen et al., 2020; Zhou & Tan, 2021; Dubey, 2021), the private contextual linear bandit problem (Shariff & Sheffet, 2018; Zheng et al., 2020; Han et al., 2020; Ren et al., 2020b; Garcelon et al., 2022), and the more general private reinforcement learning problem (Vietri et al., 2020; Garcelon et al., 2021; Chowdhury & Zhou, 2022a), in both the local and centralized models of privacy. The regret gap between the two models (when the contexts are arbitrary rather than stochastic; Han et al., 2021) has been narrowed using the intermediate sequential shuffle model (Tenenbaum et al., 2021; Chowdhury & Zhou, 2022b; Garcelon et al., 2022).…”