“…However, reward binds the agent to a certain task for which the reward represents success. Aligned with the recent surge of interest in unsupervised methods in reinforcement learning (Baranes and Oudeyer, 2013; Bellemare et al., 2016; Gregor et al., 2016; Houthooft et al., 2016; Gupta et al., 2018; Hausman et al., 2018; Pong et al., 2019; Laskin et al., 2020, 2021; He et al., 2021) and previously proposed ideas (Schmidhuber, 1991a, 2010), we argue that there exist properties of a dynamical system that are not tied to any particular task yet are highly useful; leveraging them can help solve other tasks more efficiently. This work focuses on the sensitivity of the trajectories produced by the system with respect to the policy, so-called Physical Derivatives.…”
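The excerpt defines Physical Derivatives only abstractly: the sensitivity of a system's trajectories with respect to its policy. A minimal sketch of that idea is a central finite-difference estimate of how a rollout changes as a policy parameter is perturbed. Everything here is illustrative and not from the source: the toy point-mass dynamics, the linear feedback policy, and the names `rollout`, `physical_derivative`, and `theta` are all assumptions.

```python
def rollout(theta, T=50, dt=0.1):
    """Simulate a hypothetical 1-D point mass under a linear feedback
    policy u = theta * x, returning the position trajectory.
    (Toy dynamics for illustration; not the paper's system.)"""
    x, v = 1.0, 0.0
    traj = []
    for _ in range(T):
        u = theta * x          # linear feedback policy
        v += dt * u            # Euler step for velocity
        x += dt * v            # Euler step for position
        traj.append(x)
    return traj

def physical_derivative(theta, eps=1e-4):
    """Central finite-difference sensitivity of the whole trajectory
    with respect to the scalar policy parameter theta."""
    plus = rollout(theta + eps)
    minus = rollout(theta - eps)
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

d = physical_derivative(-1.0)  # one sensitivity value per time step
```

Each entry of `d` approximates how the state at that time step would shift under an infinitesimal change of the policy; a richer policy would make `theta` a vector and `d` a Jacobian.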