…In principle, in two-person zero-sum IIEGs, the most popular methods for learning an approximate Nash equilibrium are regret minimization methods [Hoda et al., 2010; Zinkevich et al., 2007; Bowling et al., 2015; Brown and Sandholm, 2019a; Farina et al., 2020a, 2021a]. They have been used to construct many AI milestones in poker [Moravčík et al.,

Table 1: Comparison of the recent bandit regret minimization methods.

Algorithm                     Model-Free   Convergence Rate
[Lanctot et al., 2009]                     O((X√B + Y√C)/√T)
[Zhou et al., 2019]                        O(max(X
[Farina et al., 2020b]                     O((X√B + Y√C)/√T)
[Farina and Sandholm, 2021]                O(poly(X, B, Y, C)/T^{1/4})
[Farina et al., 2021b]                     O(((XB)^{3/2} + (YC)^{3/2})/√T)²
[Kozuno et al., 2021]                      O((X√B + Y√C)/√T)³
[Bai et al., 2022]                         O((…
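To make the idea behind regret minimization concrete, the following is a minimal sketch of regret matching via self-play on rock-paper-scissors, a toy two-player zero-sum game. The game, the asymmetric initialization, and the iteration count are illustrative assumptions, not details of any of the cited algorithms; it only demonstrates the general principle that time-averaged strategies of regret minimizers approach a Nash equilibrium at an O(1/√T) rate.

```python
import numpy as np

# Row player's payoff matrix: rows = P1 action, cols = P2 action (R, P, S).
# Illustrative toy game, not from the cited papers.
PAYOFF = np.array([[0., -1., 1.],
                   [1., 0., -1.],
                   [-1., 1., 0.]])

def regret_matching(regrets):
    """Turn cumulative regrets into a mixed strategy."""
    pos = np.maximum(regrets, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(3, 1 / 3)

def self_play(T=20000):
    # Asymmetric initial regrets break the symmetric fixed point.
    regrets = [np.array([1., 0., 0.]), np.array([0., 1., 0.])]
    strat_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(T):
        s = [regret_matching(r) for r in regrets]
        # Expected utility of each pure action against the opponent's mix;
        # the game is zero-sum, so P2's payoffs are the negation of P1's.
        u = [PAYOFF @ s[1], -PAYOFF.T @ s[0]]
        for p in range(2):
            strat_sums[p] += s[p]
            regrets[p] += u[p] - s[p] @ u[p]  # regret of each pure action
    return [ss / T for ss in strat_sums]      # time-averaged strategies

avg = self_play()
# The average strategies approach the unique Nash equilibrium (1/3, 1/3, 1/3).
```

The per-iteration strategies themselves may cycle; it is the average strategy profile whose exploitability shrinks, which is exactly the quantity the convergence rates in Table 1 bound.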