“…The RL agent has an action space of size 5, corresponding to the 5 possible driving actions. Through interaction with other vehicles, it determines the acceleration of the ego vehicle and whether to change lanes [34] to accelerate or to give way. Since this paper uses a GAIL‐based algorithm, no environmental reward function needs to be provided for training.…”