2012
DOI: 10.1007/978-3-642-31368-4_26
Adapting Strategies to Opponent Models in Incomplete Information Games: A Reinforcement Learning Approach for Poker

Cited by 13 publications (4 citation statements) · References 6 publications
“…In addition, opponent modeling can be adopted in multi-agent reinforcement learning problems where RL agents are designed to consider the learning of other agents in the environment when updating their own policies [36]. Another promising solution is to mimic human players by combining opponent models used by expert players and reinforcement learning [37]. All the above works demonstrate that combining opponent modeling with reinforcement learning is beneficial to achieve performance gain in multi-agent imperfect-information games, which also inspires this work.…”
Section: B. Opponent Modeling for Games (mentioning)
confidence: 99%
“…Another example is AKI-REALBOT, where the Monte-Carlo algorithm [5] was used in order to exploit the opponent's strategy. Other techniques employed include, for example, building agents based on reinforcement learning [6].…”
Section: Related Work (unclassified)
“…In [12], instead of the average, the degree of similarity was measured through the Euclidean distance between the game features. The Monte Carlo Tree Search algorithm [14] and reinforcement learning approaches [15] are other techniques that were successfully applied to the domain of Computer Poker. One should also not forget some work done on opponent modeling techniques, namely [16].…”
Section: One Great Breakthrough in the Domain of Computer Poker and Other Extensive-Form Games Research Was the Development of the Counte… (mentioning)
confidence: 99%
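The Euclidean-distance similarity measure mentioned in the citation above can be sketched briefly. This is a minimal illustration, not code from the cited works: the feature names and stored opponent profiles are hypothetical, and it assumes opponents are described by fixed-length numeric feature vectors (e.g. aggression or fold frequencies) so that the stored model closest in distance is selected.

```python
import math

def feature_distance(features_a, features_b):
    """Euclidean distance between two game-feature vectors;
    a smaller distance indicates more similar play styles."""
    assert len(features_a) == len(features_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))

# Hypothetical poker features: [aggression, fold rate, raise frequency]
observed_opponent = [0.8, 0.1, 0.6]
stored_models = {
    "loose-aggressive": [0.7, 0.2, 0.5],
    "tight-passive":    [0.2, 0.6, 0.1],
}

# Select the stored opponent model nearest to the observed behaviour.
closest = min(stored_models,
              key=lambda name: feature_distance(observed_opponent,
                                                stored_models[name]))
```

Under these assumed feature values, the observed opponent would be matched to the "loose-aggressive" profile, whose counter-strategy could then be adapted, for instance via reinforcement learning as the quoted passages describe.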