“…Some approaches were developed to improve simulators for greater sample efficiency and better sim-to-real transfer [1,81]. Carefully selecting task-specific state representations [59,51], reward functions [13,14], and action spaces [124,141] has been shown to improve both time to convergence and performance. In summary, integrating physical knowledge about the structure of the learning task has been found to improve performance and accelerate convergence.…”