2010 Ninth International Conference on Machine Learning and Applications
DOI: 10.1109/icmla.2010.65
Multi-Agent Inverse Reinforcement Learning

Cited by 62 publications (41 citation statements) | References 9 publications | Years published: 2012–2023
“…Moreover, the demonstrations are performed by multiple experts. Contrary to [Natarajan et al., 2010], the experts' policies are not independent of each other but take other experts into account. The methods developed in [Bogert and Doshi, 2014; Bogert et al., 2016] are based on maximum entropy IRL.…”
Section: IRL From Partially Observable Demonstrations (mentioning)
confidence: 99%
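The excerpt above points to maximum entropy IRL without reproducing it. As a hedged sketch of the standard Ziebart et al. (2008) formulation (the notation below is an assumption for illustration, not taken from the citing paper), a trajectory \zeta with feature counts f_\zeta is scored under reward weights \theta, and \theta is fit by maximum likelihood over the demonstration set \mathcal{D}:

P(\zeta \mid \theta) = \frac{1}{Z(\theta)} \exp\left( \theta^{\top} f_{\zeta} \right), \qquad \theta^{*} = \arg\max_{\theta} \sum_{\tilde{\zeta} \in \mathcal{D}} \log P(\tilde{\zeta} \mid \theta)

The gradient of this objective is the difference between the empirical feature counts of the demonstrations and the feature counts expected under the current trajectory distribution, so fitting \theta amounts to matching the expert's features while committing to nothing beyond them, which is the maximum-entropy principle the excerpt refers to.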
“…Active learning [Settles, 2012] relies on the fact that if an algorithm can only solicit labels for a limited number of examples, then it should choose them judiciously, since not all examples provide the same amount of information. Active learning has a long history of being successfully employed with a variety of classifiers such as logistic regression [Lewis and Gale, 1994; Lewis and Catlett, 1994], support vector machines [Tong and Koller, 2001b], and Bayesian network learning [Tong and Koller, 2000; 2001a], and in sequential decision-making tasks such as imitation learning [Judah et al., 2014] and inverse reinforcement learning [Odom and Natarajan, 2016].…”
Section: Related Work (mentioning)
confidence: 99%
“…The problem of choosing an example to obtain its class label has been addressed as active learning [Settles, 2012]. There have been several extensions of active learning, including presenting a set of features [Raghavan et al., 2006; Druck et al., 2009], getting labels over clusters [Hofmann and Buhmann, 1998] or preferences [Odom and Natarajan, 2016], and active learning in sequential decision making [Lopes et al., 2009], to name a few.…”
Section: Introduction (mentioning)
confidence: 99%
“…Abbeel et al. [11] reported the "apprenticeship learning" approach, in which the optimal policy is acquired while a reward function is estimated. Natarajan et al. [12] estimated multiple reward functions in a multi-agent environment and proposed an approach for controlling the agents' behaviors from a global perspective.…”
Section: Inverse Reinforcement Learning (mentioning)
confidence: 99%
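For the apprenticeship-learning approach mentioned in [11], a brief hedged sketch of the standard Abbeel and Ng (2004) feature-expectation matching idea (the symbols here are assumptions for illustration, not quoted from the citing paper): the reward is taken to be linear in state features, R(s) = w^{\top} \phi(s) with \|w\|_{2} \le 1, and each policy \pi is summarized by its discounted feature expectations

\mu(\pi) = \mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^{t} \phi(s_{t}) \mid \pi \right].

The learner searches for a policy \tilde{\pi} with \|\mu(\tilde{\pi}) - \mu_{E}\|_{2} \le \epsilon, where \mu_{E} is estimated from the expert's demonstrations; any such policy is then within \epsilon of the expert's expected return for every reward in the assumed linear class, which is why the policy can be acquired while the reward function is only estimated along the way.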
“…Various approaches have been proposed [10], [11], [12]. Ng et al. [10] reported an approach for estimating a reward function using linear programming for an environment with a finite state space, and the Monte Carlo method for an environment with an infinite state space.…”
Section: Inverse Reinforcement Learning (mentioning)
confidence: 99%
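As a hedged sketch of the linear-programming formulation referenced in [10] (the standard Ng and Russell (2000) finite-state form; the notation is an assumption, since the excerpt does not reproduce it), let a_{1} denote the demonstrated (optimal) action, P_{a} the transition matrix under action a, and \gamma the discount factor. One then solves

\max_{R} \; \sum_{i} \min_{a \neq a_{1}} \left\{ \left( P_{a_{1}}(i) - P_{a}(i) \right) \left( I - \gamma P_{a_{1}} \right)^{-1} R \right\} - \lambda \|R\|_{1}

\text{subject to} \quad \left( P_{a_{1}} - P_{a} \right) \left( I - \gamma P_{a_{1}} \right)^{-1} R \succeq 0 \;\; \forall a \neq a_{1}, \qquad |R_{i}| \le R_{\max}.

The constraints state that the demonstrated policy is optimal under R; the objective prefers rewards that make it optimal by a clear margin while keeping R sparse. As the excerpt notes, the variant for infinite state spaces replaces these exact matrix computations with Monte Carlo estimates of the corresponding value terms.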