2021
DOI: 10.3389/frobt.2021.669990
You Were Always on My Mind: Introducing Chef’s Hat and COPPER for Personalized Reinforcement Learning

Abstract: Reinforcement learning simulation environments provide an important experimental test bed and facilitate data collection for developing AI-based robot applications. Most of them, however, focus on single-agent tasks, which limits their applicability to the development of social agents. This study proposes the Chef’s Hat simulation environment, which implements a multi-agent competitive card game that is a complete reproduction of the homonymous board game, designed to provoke competitive strategies in humans and em…
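
The abstract describes a turn-based, multi-agent environment. As a rough orientation only, the sketch below shows how such an environment is typically driven; the environment interface used here (reset, current_player, valid_actions, step) and the agent class are assumptions for illustration, not the published Chef’s Hat API.

import random

# Illustrative sketch of driving a generic turn-based, multi-agent
# card-game environment of the kind the abstract describes. The `env`
# interface is an assumption, not the Chef's Hat API.
class RandomAgent:
    def act(self, observation, valid_actions):
        # Baseline policy: choose uniformly among the legal moves.
        return random.choice(valid_actions)

def play_match(env, agents, max_steps=500):
    observation = env.reset()
    for _ in range(max_steps):
        player = env.current_player()          # index of the agent to move
        action = agents[player].act(observation, env.valid_actions())
        observation, reward, done, info = env.step(action)
        if done:
            return info                        # final match outcome
    return None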

Cited by 2 publications (6 citation statements)
References 12 publications
“…Our rivalry modulation acts directly on two types of agents: a deep Q-learning (DQL) one and a proximal policy optimization (PPO) one. Both agents were recently adapted and optimized for the Chef's Hat game through the COPPER modulation [21]. COPPER introduces an opponent-specific experience-prioritizing memory used to improve the continual learning capabilities of each agent when playing against known opponents.…”
Section: Proposing Artificial Rivalry
confidence: 99%
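
To make the mechanism this statement refers to concrete, below is a minimal sketch of what an opponent-specific, experience-prioritizing memory could look like. The class, method names, and priority scheme are hypothetical illustrations, not the COPPER implementation from [21].

import random
from collections import defaultdict, deque

# Hypothetical sketch of an opponent-keyed, priority-weighted replay
# memory in the spirit of the COPPER description quoted above.
class OpponentReplayMemory:
    def __init__(self, capacity_per_opponent=10_000):
        # One bounded FIFO buffer per known opponent identity.
        self.buffers = defaultdict(lambda: deque(maxlen=capacity_per_opponent))

    def add(self, opponent_id, transition, priority):
        # Store (priority, transition); priority could be, e.g., |TD error|
        # plus a small epsilon so every transition keeps a nonzero weight.
        self.buffers[opponent_id].append((priority, transition))

    def sample(self, opponent_id, batch_size):
        # Priority-weighted sampling restricted to the current opponent,
        # so the learner rehearses experience relevant to that opponent
        # and retains opponent-specific strategies during continual learning.
        buffer = list(self.buffers[opponent_id])
        if not buffer:
            return []
        weights = [priority for priority, _ in buffer]
        picks = random.choices(buffer, weights=weights,
                               k=min(batch_size, len(buffer)))
        return [transition for _, transition in picks]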
“…The DQL and PPO implementations of the agents were chosen due to their success in learning different strategies [28], and their good performance when playing against human players [21]. Both agents are implemented as COPPER-based agents, and are set to keep learning during all of our experiments.…”
Section: A Chef's Hat Agent
confidence: 99%