Double-Oracle Sampling Method for Stackelberg Equilibrium Approximation in General-Sum Extensive-Form Games

Karwowski, Jan; Mańdziuk, Jacek

doi:10.1609/aaai.v34i02.5578

Cited by 13 publications

(9 citation statements)

References 23 publications

“…The first approach (referred to as O2UCT, double-oracle UCT sampling) [11,13] relies on a guided sampling of the follower's strategy space interleaved with finding a feasible leader's strategy using double-oracle method.…”

Section: A Summary of O2UCT Methods (mentioning)

confidence: 99%

“…$\delta_l$ must satisfy the following conditions: (1) $\pi^r_f$ is the best response strategy against $\delta_l$; (2) $\delta_l$ provides as high as possible leader's utility when played against the best follower's response. An algorithm of finding the requested leader's strategy $\delta_l$ is outlined below and detailed in [13].…”

Section: A Summary of O2UCT Methods (mentioning)

confidence: 99%

“…At the same time, in reference to large-scale sequential SGs, several algorithms utilizing different techniques, e.g. sequence-form [2], correlated equilibria [4], game abstraction [5], Evolutionary Algorithm [12,34] or Monte Carlo sampling [10,13] which visibly extended the range of tractable SGs, have been proposed recently.…”

Section: Motivation (mentioning)

confidence: 99%

“…Consequently, modifications to three state-of-the-art methods for solving extensive-form SSGs [4,5,2] which implement AT are proposed. Furthermore, two other non-MILP heuristic methods for solving SSG that rely on Monte Carlo sampling [11,13] and Evolutionary Algorithm [34], respectively are also adequately modified to incorporate AT principles. All five methods are experimentally evaluated on a set of Warehouse Games [10,16].…”

Section: Introduction (mentioning)

confidence: 99%

Anchoring Theory in Sequential Stackelberg Games

Karwowski, Mańdziuk, Żychowski

2019

Preprint

Self Cite

An underlying assumption of Stackelberg Games (SGs) is perfect rationality of the players. However, in real-life situations (which are often modeled by SGs) the followers (terrorists, thieves, poachers or smugglers), like humans in general, may not act in a perfectly rational way, as their decisions may be affected by biases of various kinds which bound the rationality of their decisions. One of the popular models of bounded rationality (BR) is Anchoring Theory (AT), which claims that humans have a tendency to flatten probabilities of available options, i.e. they perceive a distribution of these probabilities as being closer to the uniform distribution than it really is. This paper proposes an efficient formulation of AT in sequential extensive-form SGs (named ATSG), suitable for Mixed-Integer Linear Program (MILP) solution methods. ATSG is implemented in three MILP/LP-based state-of-the-art methods for solving sequential SGs and two recently introduced non-MILP approaches: one relying on Monte Carlo sampling (O2UCT) and the other one (EASG) employing Evolutionary Algorithms. Experimental evaluation indicates that both non-MILP heuristic approaches scale better in time than MILP solutions while providing optimal or close-to-optimal solutions. Besides competitive time scalability, an additional asset of non-MILP methods is the flexibility of potential BR formulations they are able to incorporate. While MILP approaches accept BR formulations with linear constraints only, no restrictions on the BR form are imposed in either of the two non-MILP methods.
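The probability-flattening effect described in the abstract can be illustrated with a minimal sketch. It assumes the simple linear-interpolation form often used for anchoring bias, in which the perceived distribution is a blend of the true distribution and the uniform one; this is not necessarily the ATSG formulation from the paper. The function name `anchored_distribution` and the parameter `alpha` are hypothetical.

```python
def anchored_distribution(probs, alpha):
    """Blend a distribution with the uniform one.

    Assumed model (not taken from the paper): p'_i = (1 - alpha) * p_i + alpha / n,
    where alpha = 0 means no bias and alpha = 1 means a fully uniform perception.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n = len(probs)
    return [(1.0 - alpha) * p + alpha / n for p in probs]


# Hypothetical example: a skewed distribution as perceived by a biased follower.
true_probs = [0.7, 0.2, 0.1]
perceived = anchored_distribution(true_probs, alpha=0.5)
print(perceived)
```

Under this assumed form the perceived values still sum to one, and a larger `alpha` moves every entry closer to `1 / n`.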

“…CBK2018 [17]), others employ different ideas. O2UCT [18,19], for instance, utilizes Upper Confidence Bounds applied to trees [20] (a variant of Monte Carlo Tree Search) and combines sampling the follower's strategy space with calculating the best leader's strategy for which a sampled follower's strategy is the optimal response. Another heuristic method, EASG [21], maintains a population of candidate leader's strategies and applies specifically designed mutation and crossover operators.…”

Section: State-of-the-art Approaches (mentioning)

confidence: 99%
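The selection rule behind Upper Confidence Bounds applied to trees, mentioned in the statement above, can be sketched with the generic UCB1 formula. This is a textbook illustration only: it does not reproduce how O2UCT samples the follower's strategy space, and the function name `ucb1_select`, its arguments, and the default exploration constant are assumptions.

```python
import math


def ucb1_select(counts, total_rewards, c=math.sqrt(2)):
    """Return the index of the child to visit next under generic UCB1.

    counts[i] is the number of visits of child i and total_rewards[i] is the
    sum of rewards observed for it. Unvisited children are tried first.
    """
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    # Mean reward plus an exploration bonus that shrinks as a child is visited more.
    scores = [
        r / n + c * math.sqrt(math.log(total) / n)
        for n, r in zip(counts, total_rewards)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical statistics: child 0 looks strong but child 1 is barely explored.
print(ucb1_select([10, 1], [9.0, 0.0]))
```

With `c=0` the rule becomes purely greedy on the mean reward, which is one way to see what the exploration term contributes.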

Learning Attacker's Bounded Rationality Model in Security Games

Żychowski, Mańdziuk

2021

Preprint

Self Cite

The paper proposes a novel neuroevolutionary method (NESG) for calculating the leader's payoff in Stackelberg Security Games. The heart of NESG is a strategy evaluation neural network (SENN). SENN is able to effectively evaluate the leader's strategies against an opponent who may not behave in a perfectly rational way due to certain cognitive biases or limitations. SENN is trained on historical data and does not require any direct prior knowledge regarding the follower's target preferences, payoff distribution or bounded rationality model. NESG was tested on a set of 90 benchmark games inspired by a real-world cybersecurity scenario known as deep packet inspection. Experimental results show an advantage of applying NESG over the existing state-of-the-art methods when playing against not perfectly rational opponents. The method provides high-quality solutions with superior computation time scalability. Due to the generic and knowledge-free construction of NESG, the method may be applied to various real-life security scenarios.
