“…In addition to the reward shaping approach, the conventional PID ACC is used as a baseline which, as in the RL case, is designed by dividing the control into phases for the in-range and out-of-range conditions (Canale and Malan, 2003). The traction torque request T_t is given by the PID controller, and the optimal gear is the one with the lowest fuel rate for the desired traction torque and vehicle velocity (Yoon et al., 2020; Kerbel et al., 2022).…”
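
The baseline described in the excerpt can be illustrated with a minimal sketch: a PID controller produces the traction torque request in one of two phases, and the gear is then picked by comparing fuel rates across feasible gears. Everything numeric here is an assumption made for illustration and does not come from the cited works: the gear ratios, final drive, wheel radius, engine limits, sensor range, time headway, PID gains, and the `fuel_rate` map are hypothetical placeholders. The sketch also assumes that T_t is a torque at the wheels and that the out-of-range phase tracks a set speed while the in-range phase tracks a time-headway gap.

```python
from dataclasses import dataclass

# All numeric values are hypothetical placeholders, not taken from the source.
GEAR_RATIOS = {1: 3.5, 2: 2.0, 3: 1.3, 4: 1.0, 5: 0.8}
FINAL_DRIVE = 4.0
WHEEL_RADIUS = 0.3                  # m
ENGINE_SPEED_RANGE = (80.0, 600.0)  # rad/s
MAX_ENGINE_TORQUE = 250.0           # N*m
SENSOR_RANGE = 100.0                # m; beyond this the lead vehicle is out of range
TIME_HEADWAY = 1.5                  # s


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        """Return the PID output for one time step of length dt."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def torque_request(pid_speed, pid_gap, velocity, set_speed, gap, dt):
    """Two-phase ACC: speed tracking out of range, gap tracking in range."""
    if gap is None or gap > SENSOR_RANGE:
        return pid_speed.step(set_speed - velocity, dt)
    desired_gap = TIME_HEADWAY * velocity
    return pid_gap.step(gap - desired_gap, dt)


def fuel_rate(engine_torque, engine_speed):
    """Hypothetical fuel-rate map in g/s; a real one would be a lookup table."""
    power = max(engine_torque, 0.0) * engine_speed
    return 0.1 + 6e-5 * power + 2e-6 * engine_speed ** 2


def select_gear(traction_torque, velocity):
    """Pick the feasible gear with the lowest fuel rate for (T_t, v)."""
    wheel_speed = velocity / WHEEL_RADIUS
    best_gear, best_rate = None, float("inf")
    for gear, ratio in GEAR_RATIOS.items():
        total_ratio = ratio * FINAL_DRIVE
        engine_speed = wheel_speed * total_ratio
        engine_torque = traction_torque / total_ratio
        feasible = (
            ENGINE_SPEED_RANGE[0] <= engine_speed <= ENGINE_SPEED_RANGE[1]
            and engine_torque <= MAX_ENGINE_TORQUE
        )
        if not feasible:
            continue
        rate = fuel_rate(engine_torque, engine_speed)
        if rate < best_rate:
            best_gear, best_rate = gear, rate
    # Assumption: fall back to the lowest gear when no gear is feasible.
    return best_gear if best_gear is not None else min(GEAR_RATIOS)


if __name__ == "__main__":
    pid_speed = PID(kp=100.0, ki=0.0, kd=0.0)
    pid_gap = PID(kp=50.0, ki=0.0, kd=0.0)
    t_req = torque_request(pid_speed, pid_gap, velocity=20.0,
                           set_speed=25.0, gap=None, dt=0.1)
    print(t_req, select_gear(t_req, 20.0))
```

With this toy fuel map the wheel power is the same in every gear, so the speed-dependent term makes the highest feasible gear the cheapest. A measured fuel map would generally not have that property, which is why the selection is written as an explicit minimisation over gears rather than a fixed shift schedule.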