Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information (1994)
DOI: 10.1109/21.293490

Cited by 272 publications (197 citation statements)
References 16 publications (4 reference statements)
“…In zero-sum games, the L_R-I scheme converges to the equilibrium point if one exists in pure strategies, while the L_R-εP scheme can approach a mixed equilibrium arbitrarily closely (Lakshmivarahan & Narendra, 1981). In general non-zero-sum games, it has been shown that when the automata use an L_R-I scheme and the game has a unique pure equilibrium point, convergence is guaranteed (Sastry et al., 1994). When the game matrix has more than one pure equilibrium, which equilibrium is found depends on the initial conditions.…”
Section: Learning Automata Games (mentioning)
confidence: 99%
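The L_R-I (linear reward-inaction) update referenced in this excerpt is standard: on a reward, probability mass moves toward the action just played; on a penalty, nothing changes. Below is a minimal Python sketch of that update; the function name, default step size, and the assumption of a reward signal in [0, 1] are illustrative choices, not taken from the cited papers.

```python
import numpy as np

def lri_update(p, action, beta, lam=0.01):
    """Linear reward-inaction (L_R-I) update for one learning automaton.

    p      : current action-probability vector (sums to 1)
    action : index of the action that was just played
    beta   : reward signal in [0, 1]; beta = 0 means no update (inaction)
    lam    : step-size parameter; small values favor convergence to pure equilibria
    """
    p = p.copy()
    p -= lam * beta * p        # shrink every component proportionally ...
    p[action] += lam * beta    # ... and give the freed mass to the played action
    return p
```

Because the total shrinkage equals the mass added back to the played action, the vector stays a valid probability distribution after every step.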
“…Wheeler et al. have shown that a set of decentralized learning automata is able to control a finite Markov chain with unknown transition probabilities and rewards (Wheeler & Narendra, 1986). In Sastry, Phansalkar, and Thathachar (1994) it is shown that a team of learning automata playing a general N-person stochastic game converges to a Nash equilibrium if each team member uses the linear reward-inaction (L_R-I) algorithm. Nowe and Verbeeck (2002) first introduced interconnected learning automata as a model for stigmergetic communication in multi-agent systems to solve MMDPs.…”
Section: Related Work (mentioning)
confidence: 99%
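As a concrete illustration of the convergence result attributed to Sastry, Phansalkar, and Thathachar (1994), the sketch below has two L_R-I automata (reusing lri_update from the sketch above) repeatedly play a 2x2 game with a unique pure Nash equilibrium, each observing only its own Bernoulli reward. The payoff matrices, seed, step size, and horizon are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expected-payoff matrices (rows: player 1's actions,
# columns: player 2's actions) with a unique pure Nash equilibrium at (0, 0).
A = np.array([[0.9, 0.2],
              [0.3, 0.1]])   # player 1's expected reward
B = np.array([[0.8, 0.3],
              [0.2, 0.1]])   # player 2's expected reward

p1 = np.array([0.5, 0.5])    # both automata start from uniform action probabilities
p2 = np.array([0.5, 0.5])

for _ in range(50_000):
    a1 = rng.choice(2, p=p1)
    a2 = rng.choice(2, p=p2)
    # Decentralized, incomplete information: each player sees only its own
    # Bernoulli reward, never the opponent's action or payoff.
    beta1 = float(rng.random() < A[a1, a2])
    beta2 = float(rng.random() < B[a1, a2])
    p1 = lri_update(p1, a1, beta1)
    p2 = lri_update(p2, a2, beta2)

print(p1, p2)   # both vectors should concentrate near the equilibrium actions (0, 0)
```

Starting from other initial probability vectors can steer the automata toward a different pure equilibrium when several exist, which is exactly the initial-condition dependence noted in the first excerpt.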
“…If the cluster accepts a foreign load that extends its queue length to 5, on average half of the local jobs will be delayed. We conclude that the adjustment algorithm is too simplistic; the clusters should use more sophisticated techniques for learning their optimal participation level, similar to those used in other games with incomplete information, such as [31], or genetic algorithms [28].…”
Section: Experimental Analysis (mentioning)
confidence: 99%
“…Among other approaches, including those based on reinforcement learning, maximum-entropy reinforcement learning, smoothed best response, or fictitious play, it is important to highlight the contributions in [3], [7], [8], [18]-[23]. The main drawbacks of these contributions can be summarized in five points: (i) the point of convergence is a probability distribution over the set of all available channels and power-allocation policies [21], [22], [30], [31]. Therefore, the optimization is often over the expectation of the performance metric, and optimality is often claimed only in the asymptotic regime.…”
Section: A State of the Art (mentioning)
confidence: 99%