“…Auer et al [2019], Besbes et al [2014], Cheung et al [2022], Luo et al [2018], Russac et al [2019], Trovo et al [2020], Wu et al [2018]), although they do not properly handle periodically behaving dynamical systems (see also the discussion in [Cai et al, 2021]). For discrete action settings, the periodic bandit [Oh et al, 2019] was proposed, which aims to minimize the total regret. Also, for the case where the period is known, a Gaussian process bandit for periodic reward functions was proposed (Cai et al [2021]) under a Gaussian noise assumption.…”