2022
DOI: 10.1613/jair.1.12889
A Survey of Opponent Modeling in Adversarial Domains

Abstract: Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions…

Cited by 16 publications (8 citation statements)
References 113 publications
“…Our work is related to these in the sense that we assume opponent models to be given (called "type-based reasoning" by Albrecht and Stone (2018, Section 4.2)). However, an important stream of work also studies the learning of opponent models; we refer the reader to the survey by Nashed and Zilberstein (2022).…”
Section: Related Work
confidence: 99%
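The "type-based reasoning" mentioned in this excerpt assumes a given set of candidate opponent models and maintains a belief over which one the opponent is playing. A minimal sketch of one common instantiation, a Bayesian posterior update over opponent types, is shown below; the type names, action sets, and probabilities are purely illustrative, not taken from the cited works.

```python
# Hypothetical sketch of type-based reasoning: opponent models are given
# as a fixed set of "types" (action distributions), and we maintain a
# Bayesian posterior over which type the opponent is, updated from
# observed actions.

def update_belief(belief, types, observed_action):
    """One Bayesian update: P(type | action) ∝ P(action | type) * P(type).

    belief: dict mapping type name -> prior probability
    types:  dict mapping type name -> dict mapping action -> probability
    """
    posterior = {
        t: belief[t] * types[t].get(observed_action, 0.0) for t in belief
    }
    total = sum(posterior.values())
    if total == 0.0:  # observation impossible under every type; keep prior
        return dict(belief)
    return {t: p / total for t, p in posterior.items()}

# Two assumed opponent types: an "aggressive" and a "defensive" policy.
types = {
    "aggressive": {"attack": 0.8, "defend": 0.2},
    "defensive": {"attack": 0.3, "defend": 0.7},
}
belief = {"aggressive": 0.5, "defensive": 0.5}
for action in ["attack", "attack", "defend"]:
    belief = update_belief(belief, types, action)
# After two attacks and one defend, the posterior favors "aggressive".
```

This is the filtering step only; a full type-based agent would then best-respond against the belief-weighted mixture of types.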
“…Opponent modeling is the problem of estimating the properties of an opponent (Nashed & Zilberstein, 2022). Much previous work on this topic has been done in imperfect information games like poker (Billings et al., 1998; Bard et al., 2013, 2015; Davis et al., 2014), but this work focuses on strategic characteristics and limitations of the opponents, and the domains do not include execution uncertainty.…”
Section: Opponent Modeling
confidence: 99%
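The excerpt above frames opponent modeling as estimating an opponent's properties from observed play. One of the simplest estimators of this kind, often used as a baseline in imperfect-information games, is a smoothed frequency count over the opponent's actions; the sketch below is illustrative, with hypothetical class and action names.

```python
# A minimal, hypothetical opponent model: estimate the opponent's action
# distribution from observed play, with Laplace smoothing so that unseen
# actions keep nonzero probability.
from collections import Counter

class FrequencyOpponentModel:
    def __init__(self, actions, smoothing=1.0):
        self.actions = list(actions)
        self.smoothing = smoothing
        self.counts = Counter()

    def observe(self, action):
        """Record one observed opponent action."""
        self.counts[action] += 1

    def predict(self):
        """Return a smoothed probability estimate for each known action."""
        total = sum(self.counts.values()) + self.smoothing * len(self.actions)
        return {
            a: (self.counts[a] + self.smoothing) / total for a in self.actions
        }

model = FrequencyOpponentModel(["fold", "call", "raise"])
for a in ["raise", "raise", "call"]:
    model.observe(a)
probs = model.predict()  # "raise" is now the most probable action
```

Richer models condition these estimates on game state or betting history, but the observe/predict interface stays the same.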
“…Policy reconstruction methods predict agent policies from environment observations [2], which has been shown to be beneficial in collaborative [14], competitive [3] and mixed settings [15]. Deep Reinforcement Opponent Modelling (DRON) was one of the first works combining deep RL (DRL) with opponent modelling [16].…”
Section: A. Opponent Modelling in DRL
confidence: 99%
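Policy reconstruction, as described in this excerpt, fits a model of the opponent's policy π(a | s) from logged observations of its behavior. A minimal tabular sketch, essentially behavior cloning by empirical frequencies, is below; the state and action labels are made up for illustration and are not from the cited papers.

```python
# Hypothetical sketch of policy reconstruction: fit a tabular estimate of
# the opponent's policy pi(a | s) from logged (state, action) pairs.
from collections import defaultdict, Counter

def reconstruct_policy(observations):
    """observations: iterable of (state, action) pairs from watching play.

    Returns a dict: state -> {action -> empirical probability}.
    """
    counts = defaultdict(Counter)
    for state, action in observations:
        counts[state][action] += 1
    policy = {}
    for state, actions in counts.items():
        total = sum(actions.values())
        policy[state] = {a: n / total for a, n in actions.items()}
    return policy

# Illustrative log of an opponent's behavior in two abstract states.
logged = [
    ("low_hp", "retreat"), ("low_hp", "retreat"),
    ("low_hp", "attack"), ("full_hp", "attack"),
]
policy = reconstruct_policy(logged)
```

Deep variants such as DRON replace the table with a neural network trained on the same (state, action) supervision, which generalizes across unseen states.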
“…, by freezing a copy of a PPO [9] agent trained under δ = 0-Uniform self-play [6] after 200k, 400k and 600k episodes respectively. Table I shows the hyperparameters used to train the test agents. No formal hyperparameter sweep was conducted, and the final values were chosen after a few manual trials.…”
Section: A. Training and Benchmarking Opponents
confidence: 99%
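The benchmarking setup in this excerpt freezes copies of a learning agent at fixed training milestones to serve as fixed-strength test opponents. A toy sketch of that snapshotting loop is below; the `Agent` class, the one-line update, and the milestone values are illustrative stand-ins, not the cited paper's PPO implementation.

```python
# Hypothetical sketch of milestone snapshotting during self-play: deep-copy
# the learning agent at fixed episode counts so the frozen copies can later
# be used as benchmark opponents of known training strength.
import copy

def train_with_snapshots(agent, train_one_episode, milestones, total_episodes):
    """Run self-play training, freezing a copy of the agent at each milestone."""
    frozen = {}
    for episode in range(1, total_episodes + 1):
        train_one_episode(agent)                    # one self-play update
        if episode in milestones:
            frozen[episode] = copy.deepcopy(agent)  # frozen benchmark opponent
    return frozen

class Agent:
    """Toy stand-in for a PPO agent; just counts its updates."""
    def __init__(self):
        self.updates = 0

snapshots = train_with_snapshots(
    Agent(),
    lambda a: setattr(a, "updates", a.updates + 1),
    milestones={2, 4, 6},
    total_episodes=6,
)
```

Because each snapshot is a deep copy, later training never mutates the frozen opponents, which keeps the benchmark strengths fixed.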