“…Non-asymptotic analyses for critic-only methods have been extensively studied recently, e.g., TD (Lakshminarayanan & Szepesvari, 2018; Bhandari et al, 2018; Cai et al, 2019; Sun et al, 2019), SARSA (Zou et al, 2019), and gradient TD (GTD) methods (Dalal et al, 2018; Xu et al, 2019; Wang et al, 2021; 2017; Liu et al, 2015; Gupta et al, 2019; Kaledin et al, 2020; Ma et al, 2020; Wang & Zou, 2020). There are also non-asymptotic analyses for actor-only methods, e.g., (Bhandari & Russo, 2021; Agarwal et al, 2021; Mei et al, 2020; Li et al, 2021a; Laroche & des Combes, 2021; Zhang et al, 2021; Cen et al, 2021; Zhang et al, 2020a; Lin, 2022). In this paper, we focus on AC and NAC algorithms, for which one must analyze how the errors in the actor and the critic affect each other.…”