2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)
DOI: 10.1109/fskd.2011.6019729
Multi-agent cooperation by reinforcement learning with teammate modeling and reward allotment

Cited by 6 publications (7 citation statements). References 10 publications.
“…As an example, TM-LM-ASM (team-mate model-learning model-Action Selection Model) [30] is a JAL method that combines traditional Q-learning with a team-mate modeling mechanism. To do that, each learner has to memorize a […] If we examine the case of 4 agents which can make 5 actions in a 10 × 10 grid world, we obtain:…”
Section: Joint State / Joint Action
confidence: 99%
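The arithmetic behind the quoted example can be made explicit. A joint-action learner over the full joint state/action space must index every combination of agent positions and agent actions, so the table grows exponentially in the number of agents. A minimal sketch of that count, assuming each of the 4 agents occupies one of the 100 grid cells and chooses among 5 actions:

```python
# Joint state / joint action table size for the quoted setting:
# 4 agents on a 10 x 10 grid (100 positions each), 5 actions each.
n_agents = 4
positions = 10 * 10          # cells per agent
actions = 5                  # actions per agent

joint_states = positions ** n_agents    # 100^4 = 100,000,000 joint states
joint_actions = actions ** n_agents     # 5^4 = 625 joint actions
table_entries = joint_states * joint_actions

print(table_entries)  # 62500000000
```

The 6.25 × 10^10 entries illustrate why the citing authors call full joint-action learning unsuitable for many agents or large state spaces.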
“…As explained earlier, the JAG method [18] ensures global coordination while using independent learners but needs a centralized process to make sure that all agents choose the same joint action at each learning step, whereas the TM-LM-ASM method [30] is a fully distributed learning approach that also provides global coordination but employs joint-action learners, which makes it unsuitable for systems with many agents and/or large state spaces. Our objective is to develop a new intermediate approach between TM-LM-ASM and JAG.…”
Section: Proposed Reinforcement Learning Algorithm
confidence: 99%
“…Every spot visited by a vehicle is marked to reduce the reward for visiting that spot again, preventing redundant work. In [18], estimation of teammate behavior for MARL is proposed for communication-constrained scenarios by having each agent store a teammate model for all of its teammates and continuously update this model. Deep learning is used for collision avoidance and path planning in [16] for non-communicating agents.…”
Section: Introduction
confidence: 99%
“…Spychalski and Arendt proposed a methodology for implementing machine learning capability in multi-agent systems for the aided design of selected control systems, which allowed them to improve performance by reducing the time spent processing requests previously acknowledged and stored in the learning module [12]. In [13], a new kind of multi-agent reinforcement learning algorithm, called TM_Qlearning, which combines traditional Q-learning with observation-based teammate modeling techniques, was proposed. Two multi-agent reinforcement learning methods, both consisting of promoting the selection of actions so that the chosen action relies not only on present experience but also on an estimation of possible future ones, have been proposed to better solve the coordination problem and the exploration/exploitation dilemma in the case of nonstationary environments [14].…”
Section: Introduction
confidence: 99%