Regret Circuits: Composability of Regret Minimizers

Farina, Gabriele; Kroer, Christian; Sandholm, Tuomas

doi:10.48550/arxiv.1811.02540

Cited by 2 publications

(3 citation statements)

References 13 publications

“…All three seem to rely on the same fundamental building block of better understanding the behavior of no-regret learners whose rewards are determined by (asynchronous) observations of other no-regret learners. Some recent progress along these lines has been made [20,33], but more work is needed.…”

Section: Discussion (mentioning)

confidence: 99%

“…CFR algorithms remain an active topic of research; recent work has shown how to combine it with function approximation [48,41,30,11,36], improve the convergence rate in certain settings [19], and apply it to more complex structures [20]. Most relevant to our work, examples are known where CFR fails to converge to the correct policy without perfect recall [35].…”

Section: Related Work (mentioning)

confidence: 99%

Combining No-regret and Q-learning

Kash, Sullins, Hofmann

2019

Preprint

Counterfactual Regret Minimization (CFR) has found success in settings like poker which have both terminal states and perfect recall. We seek to understand how to relax these requirements. As a first step, we introduce a simple algorithm, local no-regret learning (LONR), which uses a Q-learning-like update rule to allow learning without terminal states or perfect recall. We prove its convergence for the basic case of MDPs (and limited extensions of them) and present empirical results showing that it achieves last iterate convergence in a number of settings, most notably NoSDE games, a class of Markov games specifically designed to be challenging to learn where no prior algorithm is known to achieve convergence to a stationary equilibrium even on average.

“…Extensive-form games in which players' strategy sets can depend on other players' actions have been studied by Davis, Waugh, and Bowling [19] assuming payoffs are bilinear, and by Farina, Kroer, and Sandholm [27] for another specific class of convex-concave payoffs.…”

Section: Iteration Complexity (mentioning)

confidence: 99%

Convex-Concave Min-Max Stackelberg Games

Denizalp, Greenwald

2021

Preprint

Min-max optimization problems (i.e., min-max games) have been attracting a great deal of attention because of their applicability to a wide range of machine learning problems. Although significant progress has been made recently, the literature to date has focused on games with independent strategy sets; little is known about solving games with dependent strategy sets, which can be characterized as min-max Stackelberg games. We introduce two first-order methods that solve a large class of convex-concave min-max Stackelberg games, and show that our methods converge in polynomial time. Min-max Stackelberg games were first studied by Wald, under the posthumous name of Wald's maximin model, a variant of which is the main paradigm used in robust optimization, which means that our methods can likewise solve many convex robust optimization problems. We observe that the computation of competitive equilibria in Fisher markets also comprises a min-max Stackelberg game. Further, we demonstrate the efficacy and efficiency of our algorithms in practice by computing competitive equilibria in Fisher markets with varying utility structures. Our experiments suggest potential ways to extend our theoretical results, by demonstrating how different smoothness properties can affect the convergence rate of our algorithms.
