“…Another long line of work studies collision models, in which, unlike in our model, rewards are lower if multiple agents simultaneously pull the same arm (e.g., [1,5,13,21,30,41,42,45,50]). Along these lines, other reward structures have also been studied, such as rewards that are a function of the agents' joint action (e.g., [8,9,32]).…”