“…8) see Fig. 3.8, which illustrates that PI can be viewed as Newton's method for solving the Bellman equation in the function space of cost functions J. The interpretation of PI as a form of Newton's method has a long history, for which we refer to the original papers by Kleinman [Klei68] for linear quadratic problems, and by Pollatschek and Avi-Itzhak [PoA69] for the finite-state discounted and Markov game cases. Subsequent works, which address broader classes of problems and algorithmic variations, include (among others) Hewer [Hew71], Puterman and Brumelle [PuB78], [PuB79], Saridis and Lee [SaL79] (following Rekasius [Rek64]), Beard [Bea95], Beard, Saridis, and Wen [BSW99], Santos and Rust [SaR04], Bokanowski, Maroso, and Zidani [BMZ09], Hylla [Hyl11], Magirou, Vassalos, and Barakitis [MVB20], Bertsekas [Ber21c], and Kundu and Kunisch [KuK21]. Some of these papers include superlinear convergence rate results.

Rollout

Generally, rollout with base policy µ can be viewed as a single iteration of Newton's method starting from J_µ, as applied to the solution of the Bellman equation (see Fig. 3.8).…”
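
To make the Newton interpretation concrete, here is a minimal Python sketch that is not taken from the source. It assumes a hypothetical two-state, two-control discounted problem; the transition probabilities `P`, stage costs `G`, discount factor `ALPHA`, and all function names are illustrative choices. The sketch compares an explicit Newton step for F(J) = J - TJ, linearized at the policy that is greedy with respect to J, against a PI step (policy improvement followed by exact policy evaluation), and treats rollout as one such step started from the cost of a base policy.

```python
# Illustrative sketch: every number below is hypothetical, not from the source.
ALPHA = 0.9  # discount factor (assumed)
# P[u][i][j]: probability of moving from state i to state j under control u
P = [[[0.8, 0.2], [0.3, 0.7]],
     [[0.1, 0.9], [0.6, 0.4]]]
# G[u][i]: expected stage cost at state i under control u
G = [[2.0, 1.0],
     [0.5, 3.0]]
STATES = CONTROLS = range(2)


def q_factor(J, i, u):
    """Cost of applying control u at state i, then incurring cost-to-go J."""
    return G[u][i] + ALPHA * sum(P[u][i][j] * J[j] for j in STATES)


def bellman(J):
    """Bellman operator T: (TJ)(i) = min over u of q_factor(J, i, u)."""
    return [min(q_factor(J, i, u) for u in CONTROLS) for i in STATES]


def greedy_policy(J):
    """Policy improvement: the control attaining the minimum in (TJ)(i)."""
    return [min(CONTROLS, key=lambda u: q_factor(J, i, u)) for i in STATES]


def solve2(A, b):
    """Solve the 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def linearization(mu):
    """Return (I - ALPHA * P_mu, g_mu) for the policy mu."""
    A = [[(1.0 if i == j else 0.0) - ALPHA * P[mu[i]][i][j] for j in STATES]
         for i in STATES]
    g = [G[mu[i]][i] for i in STATES]
    return A, g


def evaluate_policy(mu):
    """Policy evaluation: solve J = g_mu + ALPHA * P_mu J exactly."""
    A, g = linearization(mu)
    return solve2(A, g)


def pi_step(J):
    """One PI step: improve with respect to J, then evaluate the new policy."""
    return evaluate_policy(greedy_policy(J))


def newton_step(J):
    """One Newton step for F(J) = J - TJ = 0.

    The Jacobian of F at J is taken to be I - ALPHA * P_mu, where mu is
    greedy with respect to J (T is piecewise linear, so this is the slope
    of the piece that is active at J).
    """
    A, _ = linearization(greedy_policy(J))
    TJ = bellman(J)
    residual = [J[i] - TJ[i] for i in STATES]
    delta = solve2(A, residual)
    return [J[i] - delta[i] for i in STATES]


def rollout_cost(base_policy):
    """Cost of the rollout policy: one Newton/PI step started from J_mu."""
    return pi_step(evaluate_policy(base_policy))
```

In this sketch `newton_step(J)` and `pi_step(J)` should agree up to rounding for any starting vector `J`, because the Newton update J - (I - ALPHA P_mu)^{-1} (J - g_mu - ALPHA P_mu J) simplifies algebraically to (I - ALPHA P_mu)^{-1} g_mu, which is the cost of the greedy policy. Calling `rollout_cost([0, 1])` applies the same step once, starting from the cost of the base policy `[0, 1]`.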