“…State representation: [24], [53], [86], [87], [97] | [22], [55], [138]
Reward design: [38], [119] | [18] | [93] | [135] | [23], [31], [33], [54]
Abstract learning: [27] | [106], [107] | [3], [16], [82], [134], [136]
Offline RL: [26] | [1], [20], [39], [63], [116], [133], [140]
Parallel learning: [48], [114] | [11], [32], [44], [58], [79], [80], [88], [113]
Learning from demonstration: [7], [35] | [19]…”