Learning Coordination in Multi-Agent Systems Using Influence Value Reinforcement Learning

Barrios-Aranibar, Dennis; Gonçalves, Luiz Marcos Garcia

doi:10.1109/isda.2007.136

Cited by 7 publications

(8 citation statements)

References 14 publications

(6 reference statements)

“…As shown in Figure 4, when the exploration rate increases, the Independent Learning algorithm loses the capability of converging to positions (1,3) and (3,3).…”

Section: Results (mentioning)

confidence: 98%

“…If they reach these final positions at the same time, they obtain a positive reward. When they reach the position (1, 3) at the same time they obtain 5 points and when they reach the position (3,3) (1,2), and finally the action go down leads the agent to position (3,2).…”

Section: Results (mentioning)

confidence: 99%

“…Our paradigm is based on social interaction of people [1]. When two persons interact, they communicate to each other what they think about their actions.…”

Section: Influence Value Reinforcement Learning (mentioning)

confidence: 99%

“…In previous work [1], the authors showed that this paradigm outperforms traditional ones (independent learning and joint action learning) in general-sum repetitive games. In this work, the authors extend this model for application in two-agent stochastic games and compare its performance with traditional paradigms in general-sum games.…”

Section: Introduction (mentioning)

confidence: 95%

Learning to Reach Optimal Equilibrium by Influence of Other Agents Opinion

Barrios-Aranibar, Gonçalves

2007

7th International Conference on Hybrid Intelligent Systems (HIS 2007)

Self Cite

In this work, the authors extend the reinforcement learning paradigm for multi-agent systems called "Influence Value Reinforcement Learning" (IVRL). In previous work, an algorithm for repetitive games was proposed, and it outperformed traditional paradigms. Here, the authors define an algorithm based on this paradigm for cases where agents have to learn from delayed rewards, that is, an influence value reinforcement learning algorithm for two-agent stochastic games. The IVRL paradigm is based on human social interaction, especially the fact that people communicate to each other what they think about each other's actions, and that this opinion has some influence on each other's behavior. A modified version of the Q-Learning algorithm using this paradigm was constructed. The resulting IVQ-Learning algorithm was implemented and compared with versions of Q-Learning for independent learning and joint action learning. The approach shows a higher probability of converging to an optimal equilibrium than the IQ-Learning and JAQ-Learning algorithms, especially when exploration increases.
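
To make the idea in the abstract concrete, here is a minimal Python sketch of a tabular Q-Learning update, plus a variant in which another agent's opinion is added to the reward. The abstract does not give the IVQ-Learning update rule, so the `influenced_q_update` function, the `beta` weight, and the use of the peer's TD error as its "opinion" are illustrative assumptions, not the authors' formulation.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-Learning update; returns the TD error."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


def influenced_q_update(q, state, action, reward, next_state, actions,
                        peer_opinion, beta=0.5, alpha=0.1, gamma=0.9):
    """Hypothetical variant (assumption): the peer's opinion, scaled by
    beta, is added to the reward before the usual Q-Learning update."""
    shaped_reward = reward + beta * peer_opinion
    return q_update(q, state, action, shaped_reward, next_state, actions,
                    alpha, gamma)


ACTIONS = ("up", "down")
q_a = defaultdict(float)  # Q-table of agent A, all values start at 0.0
q_b = defaultdict(float)  # Q-table of agent B

# One joint step: agent B learns independently, and its TD error is
# used here (as an assumption) as its opinion about the joint outcome.
opinion_b = q_update(q_b, "s0", "up", 5.0, "s1", ACTIONS)
td_a = influenced_q_update(q_a, "s0", "up", 5.0, "s1", ACTIONS,
                           peer_opinion=opinion_b)
```

With empty Q-tables, agent B's update is plain independent learning, while agent A's update sees a reward shifted by B's opinion. Setting `beta=0` reduces the variant to the independent-learning baseline.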

“…A critical issue when modeling elevator group dispatch control for reinforcement learning is the explosion of the state space. One path taken to reduce this difficulty is to model the elevator group as a Multi-Agent System (MAS) [Barrios-Aranibar and Gonçalves 2007]. This approach generally reduces the storage (state space) needed by each agent, but at a heavy cost in learning speed [R. S. Sutton and A. G. Barto 2000].…”

Section: Introduction (unclassified)

A solution for the Elevators Group Dispatch by Multiagent Reinforcement Learning

Memória, Maia

2019

Anais Do XVI Encontro Nacional De Inteligência Artificial E Computacional (ENIAC 2019)

In this work, a model and an algorithm based on multiagent reinforcement learning are developed for the elevator group dispatch problem. The main advantage is that, together with function approximation, this multi-agent solution reduces the state space, allowing complex states to be handled with a synthesizing evaluation function. Each elevator is treated as an agent that has to decide between two actions: answer or ignore the new call. Over a number of iterations, the agents learn the weights of an evaluation function that approximates the state-action value function. The performance of the solution (average waiting time, AWT), evaluated while varying the traffic pattern, flow of people, number of elevators, and number of floors, is comparable to other current proposals reported in the literature.
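
The evaluation function described in this abstract can be pictured as a linear approximation of the state-action value, sketched below in Python. The abstract does not specify the features, the learning rule, or the step size, so the feature vector (bias, distance to the call, passengers on board), the gradient-style weight update, and `alpha` are all assumptions made for illustration.

```python
def q_value(weights, features):
    """Linear evaluation function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def choose_action(weights, features_by_action):
    """Pick the action ("answer" or "ignore") with the highest estimate."""
    return max(features_by_action,
               key=lambda a: q_value(weights, features_by_action[a]))


def update_weights(weights, features, target, alpha=0.1):
    """Move the estimate for these features toward the target value."""
    error = target - q_value(weights, features)
    return [w + alpha * error * f for w, f in zip(weights, features)]


# Hypothetical features: [bias, floors to the call, passengers on board]
features_by_action = {
    "answer": [1.0, 2.0, 3.0],
    "ignore": [1.0, 0.0, 3.0],
}

weights = [0.0, 0.0, 0.0]
# Assumed target: negative waiting time observed after answering the call.
weights = update_weights(weights, features_by_action["answer"], target=-4.0)
chosen = choose_action(weights, features_by_action)
```

After this single update the weights become negative, so the agent in this toy example prefers `"ignore"`, whose features yield the less negative estimate.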
