Near-Optimal Learning of Extensive-Form Games with Imperfect Information

This paper resolves the open question of designing near-optimal algorithms for learning imperfect-information extensive-form games from bandit feedback. We present the first line of algorithms that require only O((XA + YB)/ε²) episodes of play to find an ε-approximate Nash equilibrium in two-player zero-sum games, where X and Y are the numbers of information sets and A and B are the numbers of actions for the two players. This improves upon the best known sample complexity of O((X²A + Y²B)/ε²) by a factor of O(max{X, Y}), and matches the information-theoretic lower bound up to logarithmic factors. We achieve this sample complexity with two new algorithms: Balanced Online Mirror Descent and Balanced Counterfactual Regret Minimization. Both algorithms rely on novel approaches to integrating balanced exploration policies into their classical counterparts. We also extend our results to learning Coarse Correlated Equilibria in multi-player general-sum games.
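The claimed improvement factor can be checked directly from the two bounds stated above. The following is a short sketch that ignores constants and logarithmic factors; the numerical values at the end are illustrative assumptions, not figures from the paper.

```latex
% Ratio of the previous bound to the new bound (the 1/\varepsilon^2 factors cancel).
\[
  X^2 A + Y^2 B \;=\; X \cdot (XA) + Y \cdot (YB)
  \;\le\; \max\{X, Y\} \cdot (XA + YB),
\]
\[
  \text{so}\qquad
  \frac{(X^2 A + Y^2 B)/\varepsilon^2}{(XA + YB)/\varepsilon^2}
  \;\le\; \max\{X, Y\},
  \qquad \text{with equality when } X = Y.
\]
% Illustrative (assumed) sizes: X = Y = 1000, A = B = 10.
\[
  X^2 A + Y^2 B = 2 \times 10^{7},
  \qquad
  XA + YB = 2 \times 10^{4},
  \qquad
  \text{ratio} = 10^{3} = \max\{X, Y\}.
\]
```

In other words, the saving grows with the number of information sets, which is the quantity that typically dominates in large extensive-form games.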

