“…However, while controllers behave well in idealized simulated environments, they often struggle when transferred to the real world, exhibiting infeasible motor-control behaviors due to the discrepancy between simulation and reality, commonly referred to as the reality gap. Some approaches address the reality gap with conventional optimization methods such as model predictive control (MPC), allowing the policy to adjust on the real robot [30,68,40,76]. Others have investigated methods that leverage real-world data, such as learning directly on real robots [22,21,62], identifying system parameters [29], or adapting policy behaviors [53,83,36].…”