How to Make the Perfect Fireworks Display: Two Strategies for Hanabi

Cox, C.H.; De Silva, Jessica; DeOrsey, Philip; Kenter, Franklin; Retter, Troy; Tobin, Josh

doi:10.4169/math.mag.88.5.323

Cited by 28 publications

(16 citation statements)

References 4 publications

“…By combining evolution, new rules and specialized behavior, we get an improvement over the best purely rule-based agents, going from 18.16 to 19.32. While hat agents [13], [14] score significantly better than our mirror agents, they are unsuited for mixed or human play. To our knowledge, the only published non-hat agent that exceeds our score is the combination of Tree Search with a rule-based agent as evaluator, seen in [14], with a score of 20.22 across all game sizes.…”

Section: Discussion (mentioning)

confidence: 75%

Evolving Agents for the Hanabi 2018 CIG Competition

Canaan

Shen

Torrado

et al. 2018

2018 IEEE Conference on Computational Intelligence and Games (CIG)

Hanabi is a cooperative card game with hidden information that has won important awards in the industry and received some recent academic attention. A two-track competition of agents for the game will take place in the 2018 CIG conference. In this paper, we develop a genetic algorithm that builds rulebased agents by determining the best sequence of rules from a fixed rule set to use as strategy. In three separate experiments, we remove human assumptions regarding the ordering of rules, add new, more expressive rules to the rule set and independently evolve agents specialized at specific game sizes. As result, we achieve scores superior to previously published research for the mirror and mixed evaluation of agents.

“…Our SmartBot results only risks lives in the two player setting. HatBot [40] and WTFWThat [41]. HatBot uses a technique often seen in coding theory and "hat puzzles".…”

Section: Rule-based Approaches (mentioning)

confidence: 99%

The Hanabi challenge: A new frontier for AI research

Bard

Foerster

Chandar

et al. 2020

Artificial Intelligence

From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques for such theory of mind reasoning will not only be crucial for success in Hanabi, but also in broader collaborative efforts, especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques.

Footnote 6: One such equilibrium occurs when players do not intentionally communicate information to other players, and ignore what other players tell them (historically called a pooling equilibrium in pure signalling games [15], or a babbling equilibrium in later work using cheap talk [16]). In this case, there is no incentive for a player to start communicating because they will be ignored, and there is no incentive to pay attention to other players because they are not communicating.

Footnote 7: In pure signalling games where actions are purely communicative, policies are often referred to as communication protocols. Though Hanabi is not such a pure signalling game, when we want to emphasize the communication properties of an agent's policy we will still refer to its communication protocol.

Footnote 8: We use the word convention to refer to the parts of a communication protocol or policy that interrelate. Technically, these can be thought of as constraints on the policy to enact the convention.

“…Research on AI agents playing Hanabi has been widely conducted in recent years (Osawa, 2015; Cox et al., 2015; Walton-Rivers et al., 2019; Sato and Osawa, 2019; Bard et al., 2020). Hanabi is a game where the results tend to differ depending on the combination of the teammate's strategy and your own.…”

Section: Background Of Hanabi Study: a Unique Testbed For Analyzing Human Cooperation (mentioning)

confidence: 99%

“…A game played between similar agents is suitable for obtaining a theoretical solution. One of the most famous studies examining Hanabi's theoretical solutions was the work of Cox et al. (2015). They took Hanabi's problem as a hat guessing task (Butler et al., 2009) and found that they got an average score of 24.7 in a five-player game.…”

Section: Background Of Hanabi Study: a Unique Testbed For Analyzing Human Cooperation (mentioning)

confidence: 99%
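
The "hat guessing" idea referenced in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation. It assumes each player has a hidden integer value that every other player can see, and that a hint can take one of `modulus` distinguishable values (8 is used below purely as an illustrative assumption). The function names `encode_hint` and `decode_own_value` are hypothetical.

```python
def encode_hint(values, hinter, modulus):
    """Hint = sum of every other player's hidden value, reduced mod `modulus`.

    `values[i]` is player i's hidden value, visible to everyone except i.
    The hinter cannot see values[hinter], so it is left out of the sum.
    """
    return sum(v for i, v in enumerate(values) if i != hinter) % modulus


def decode_own_value(values, hinter, me, hint, modulus):
    """Recover player `me`'s own hidden value from the public hint.

    `me` subtracts the values it can see (everyone except the hinter and
    itself) from the hint. values[me] is never read here.
    """
    seen = sum(v for i, v in enumerate(values) if i not in (hinter, me))
    return (hint - seen) % modulus


if __name__ == "__main__":
    hidden = [3, 1, 4, 0, 2]   # illustrative values for five players
    hint = encode_hint(hidden, hinter=0, modulus=8)
    for player in range(1, 5):
        print(player, decode_own_value(hidden, 0, player, hint, 8))
```

In this sketch a single hint lets all four non-hinting players recover their own value at once, which is the property that makes hat-style encodings efficient when hints are a scarce resource.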

Emergence of Cooperative Impression With Self-Estimation, Thinking Time, and Concordance of Risk Sensitivity in Playing Hanabi

et al. 2021

The authors evaluate the extent to which a user’s impression of an AI agent can be improved by giving the agent the ability of self-estimation, thinking time, and coordination of risk tendency. The authors modified the algorithm of an AI agent in the cooperative game Hanabi to have all of these traits, and investigated the change in the user’s impression by playing with the user. The authors used a self-estimation task to evaluate the effect that the ability to read the intention of a user had on the user’s impression. The authors also show that an agent’s thinking time influences the impression it makes. The authors also investigated the relationship between the concordance of the risk-taking tendencies of players and agents, the player’s impression of agents, and the game experience. The results of the self-estimation task experiment showed that the more accurate the estimation of the agent’s self, the more likely it is that the partner will perceive humanity, affinity, intelligence, and communication skills in the agent. The authors also found that an agent that varies its thinking time according to the priority of the action appears smarter, when the player notices the difference in thinking time, than an agent with a normal thinking time or one that varies its thinking time randomly. The results of the experiment on the concordance of risk-taking tendencies show that it influences the player’s impression of agents. These results suggest that game agent designers can improve the player’s disposition toward an agent and the game experience by adjusting the agent’s self-estimation level, thinking time, and risk-taking tendency according to the player’s personality and inner state during the game.
