2000
DOI: 10.1111/1468-0262.00153

A Simple Adaptive Procedure Leading to Correlated Equilibrium

Abstract: We propose a new and simple adaptive procedure for playing a game: "regret-matching." In this procedure, players may depart from their current play with probabilities that are proportional to measures of regret for not having used other strategies in the past. It is shown that our adaptive procedure guarantees that, with probability one, the empirical distributions of play converge to the set of correlated equilibria of the game.
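
The procedure described in the abstract can be illustrated with a minimal Python sketch for a two-player game. This is an illustration under stated assumptions, not the paper's own code: the function name `regret_matching`, the payoff layout, the example game `CHICKEN`, and the value of the normalizing constant `mu` are all hypothetical choices. At each step a player either repeats the current action or switches to another one with probability proportional to the positive part of the average gain that action would have earned on the past occasions the current action was played.

```python
import random


def regret_matching(payoffs, steps, mu, seed=0):
    """Simulate regret-matching play in a two-player game (illustrative sketch).

    payoffs[i][a0][a1] is player i's payoff when player 0 plays a0 and
    player 1 plays a1. mu is an assumed constant, chosen large enough that
    the switching probabilities sum to less than one.
    Returns (empirical distribution over action profiles, largest average regret).
    """
    rng = random.Random(seed)
    n_actions = [len(payoffs[0]), len(payoffs[0][0])]
    # diff_sum[i][j][k]: cumulative gain player i would have obtained by
    # playing k on every past step where it actually played j.
    diff_sum = [[[0.0] * n for _ in range(n)] for n in n_actions]
    counts = {}
    current = [rng.randrange(n) for n in n_actions]

    for t in range(1, steps + 1):
        profile = tuple(current)
        counts[profile] = counts.get(profile, 0) + 1

        # Update the regret measures for the action each player just used.
        for i in range(2):
            j = profile[i]
            realized = payoffs[i][profile[0]][profile[1]]
            for k in range(n_actions[i]):
                alt = list(profile)
                alt[i] = k
                diff_sum[i][j][k] += payoffs[i][alt[0]][alt[1]] - realized

        # Switch away from the current action with probability proportional
        # to positive average regret; keep it with the remaining probability.
        nxt = []
        for i in range(2):
            j = profile[i]
            probs = [max(diff_sum[i][j][k] / t, 0.0) / mu
                     for k in range(n_actions[i])]
            probs[j] = 0.0
            probs[j] = max(0.0, 1.0 - sum(probs))
            nxt.append(rng.choices(range(n_actions[i]), weights=probs)[0])
        current = nxt

    dist = {profile: c / steps for profile, c in counts.items()}
    max_regret = max(v / steps for player in diff_sum for row in player for v in row)
    return dist, max_regret


# Hypothetical example: a game of Chicken (action 0 = yield, action 1 = dare).
CHICKEN = [
    [[6, 2], [7, 0]],  # player 0's payoffs
    [[6, 7], [2, 0]],  # player 1's payoffs
]
dist, max_regret = regret_matching(CHICKEN, steps=2000, mu=20.0)
print(dist, max_regret)
```

The returned `max_regret` is the largest average conditional regret over both players. The convergence result stated in the abstract corresponds to this quantity becoming non-positive in the limit; a finite run such as the one above only approximates that.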

Cited by 872 publications (574 citation statements)
References 22 publications
“…[20] for the corresponding property in finite games; Sandholm [27] calls this class 'stable games'). Note first that linear quadratic games like f(x, y) = −x² + axy satisfy condition (16) if and only if a ≤ 0 as one can easily check. Furthermore, every symmetric zero-sum game satisfies condition (16).…”
Section: Negative Semi-definite Games
Citation type: mentioning
confidence: 99%
“…Note first that linear quadratic games like f(x, y) = −x² + axy satisfy condition (16) if and only if a ≤ 0 as one can easily check. Furthermore, every symmetric zero-sum game satisfies condition (16). By definition of a symmetric zero-sum game, f(x, y) + f(y, x) = 0 for all x, y ∈ S. This implies that E(P, Q) + E(Q, P) = 0, and in particular E(P, P) = 0.…”
Section: Negative Semi-definite Games
Citation type: mentioning
confidence: 99%
“…Throughout the paper we use this framework in which the algorithm has a limited access to the losses. For example, in the so-called multi-armed bandit problem the algorithm has only information on the loss of the chosen expert, and no information is available about the loss it would have suffered had it made a different decision (see, e.g., Auer et al. [1], Hart and Mas-Colell [13]). Another example is label efficient prediction, where it is expensive to obtain the losses of the experts, and therefore the algorithm has the option to query this information (see Cesa-Bianchi et.…”
Section: Introduction
Citation type: mentioning
confidence: 99%