“…, they have constant behaviour over time. Recently, a new set of techniques for non-stationary MAB settings has been proposed and has shown promising results in a wide range of applications in the Internet advertising and dynamic pricing fields, but not in environmental monitoring [31–34]. This framework is usually described as a slot machine game with several arms characterized by different rewards, which in the non-stationary case might change as the game progresses. At the beginning of the game, the player pulls the arms randomly, having no prior knowledge of the rewards; as the game progresses, they focus on the most promising arm and pull the others less frequently.…”
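
The slot machine game described in the excerpt can be illustrated with a small simulation. The sketch below is not the method of the cited works; it assumes a simple epsilon-greedy player that estimates each arm from only its most recent rewards, so that old observations are forgotten when the rewards change. The class name `SlidingWindowEpsilonGreedy`, the function `simulate`, the window length, the exploration rate, and the two Bernoulli arms whose success probabilities swap halfway through are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class SlidingWindowEpsilonGreedy:
    """Epsilon-greedy player that scores each arm by its most recent rewards only."""

    def __init__(self, n_arms, window=50, epsilon=0.1, seed=None):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # One bounded history per arm; the oldest reward drops out when full.
        self.history = [deque(maxlen=window) for _ in range(n_arms)]

    def select_arm(self):
        # With no prior knowledge, untried arms are pulled first, in random order.
        untried = [a for a in range(self.n_arms) if not self.history[a]]
        if untried:
            return self.rng.choice(untried)
        # Keep exploring occasionally so a change in a neglected arm is noticed.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        # Otherwise pull the arm with the best recent average reward.
        means = [sum(h) / len(h) for h in self.history]
        return max(range(self.n_arms), key=means.__getitem__)

    def update(self, arm, reward):
        self.history[arm].append(reward)


def simulate(steps=2000, change_at=1000, seed=0):
    """Run the player on two Bernoulli arms whose probabilities swap at `change_at`.

    Returns the pull counts per arm before and after the change.
    """
    env_rng = random.Random(seed)
    agent = SlidingWindowEpsilonGreedy(n_arms=2, window=50, epsilon=0.1, seed=seed + 1)
    pulls_before = [0, 0]
    pulls_after = [0, 0]
    for t in range(steps):
        # Non-stationary rewards: arm 0 is best at first, arm 1 is best later.
        probs = (0.8, 0.2) if t < change_at else (0.2, 0.8)
        arm = agent.select_arm()
        reward = 1.0 if env_rng.random() < probs[arm] else 0.0
        agent.update(arm, reward)
        if t < change_at:
            pulls_before[arm] += 1
        else:
            pulls_after[arm] += 1
    return pulls_before, pulls_after


if __name__ == "__main__":
    before, after = simulate()
    print("pulls per arm before the change:", before)
    print("pulls per arm after the change: ", after)
```

Under these assumptions the player should concentrate its pulls on arm 0 before the change and shift to arm 1 afterwards, while still pulling the weaker arm occasionally. A stationary strategy that averaged over the entire history would react to the swap far more slowly, which is the motivation for forgetting old rewards in the non-stationary setting.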