2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros45743.2020.9341473

Real-World Human-Robot Collaborative Reinforcement Learning

Abstract: The intuitive collaboration of humans and intelligent robots (embodied AI) in the real world is an essential objective for many desirable applications of robotics. While there is much research regarding explicit communication, we focus on how humans and robots interact implicitly, at the motor-adaptation level. We present a real-world setup of a human-robot collaborative maze game, designed to be non-trivial and only solvable through collaboration, by limiting the actions to rotations of two orthogonal axes, and a…
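The abstract describes a tilt-maze in which the action space is restricted to rotations about two orthogonal axes, with control split between the human and the robot. The sketch below is a rough, hypothetical illustration of such a shared-control interface; the class name, dynamics, and reward are illustrative assumptions, not the authors' implementation.

import numpy as np

class MazeTiltEnv:
    """Toy stand-in for a two-axis tilt maze: a ball rolls on a plane
    whose pitch is set by one partner and whose roll by the other,
    so neither controller can reach the goal alone."""

    def __init__(self):
        self.ball = np.zeros(2)            # ball position (x, y)
        self.vel = np.zeros(2)             # ball velocity
        self.goal = np.array([1.0, 1.0])   # target position

    def step(self, human_tilt, agent_tilt):
        # Each partner commands exactly one orthogonal rotation axis.
        tilt = np.array([human_tilt, agent_tilt])  # [pitch, roll] in radians
        self.vel += 0.1 * np.sin(tilt)             # crude gravity model
        self.ball += self.vel
        dist = np.linalg.norm(self.goal - self.ball)
        return self.ball.copy(), -dist, dist < 0.05  # obs, reward, done

env = MazeTiltEnv()
obs, reward, done = env.step(human_tilt=0.1, agent_tilt=-0.05)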

Cited by 13 publications (11 citation statements)
References 13 publications
“…Agents trained using deep reinforcement learning (DRL; i.e., the integration of reinforcement learning with deep neural networks), for example, have been successful in discovering adaptive behavior and strategies in individual [59] and group task contexts [60, 61]. Within the context of working with humans in collaborative tasks, such agents can develop control policies that are either user-specific [62] or generalize to a distribution of human strategies during training [63]. By giving meaning to actions with the use of reward functions [64], black-box self-supervised approaches have the ability to provide a “direct fit” [65] between an agent and task-relevant states, assuming there is sufficient sampling of the task environment.…”
Section: Discussion
Citation type: mentioning
Confidence: 99%
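The excerpt above names the two generic DRL ingredients: a deep network as the policy, and a reward function that “gives meaning to actions”. Below is a minimal sketch of how a reward signal shapes a neural policy, written as a generic REINFORCE update in PyTorch; this is an illustrative assumption of one standard DRL recipe, not the method of the cited papers.

import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_update(states, actions, rewards, gamma=0.99):
    # Discounted returns: the reward function is what assigns credit
    # (i.e., "meaning") to each action in the trajectory.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    logits = policy(torch.stack(states))                   # (T, n_actions)
    logp = torch.distributions.Categorical(logits=logits).log_prob(
        torch.tensor(actions))                             # (T,)
    loss = -(logp * returns).mean()  # raise log-prob of rewarded actions
    opt.zero_grad()
    loss.backward()
    opt.step()

# Hypothetical rollout data: five steps of 4-d states and binary actions.
states = [torch.randn(4) for _ in range(5)]
actions = [0, 1, 1, 0, 1]
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
reinforce_update(states, actions, rewards)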
“…Indeed, a substantial limitation or impediment to the development of DPMP models is that they require researchers and modelers to have a thorough a priori understanding of the task dynamics that ensure task success, which often involves a significant amount of experimental research and data-driven optimization. In contrast, DRL methods do not require any prior knowledge of the dynamics of the task environment or the agent’s actions, and thus show great promise in developing flexible strategies and interaction couplings that are either user-specific (Shafti et al., 2020) or can generalize to a diversity of users (Carroll et al., 2019). However, DRL methods are also often notoriously slow and computationally expensive, which restricts their use in applications with sparse rewards or without access to datasets of desired behavior.…”
Section: Discussion
Citation type: mentioning
Confidence: 99%
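The model-free property this excerpt contrasts with DPMP models can be made concrete: the tabular Q-learning update below consumes only sampled (s, a, r, s') transitions and never references a model of the environment's dynamics. A minimal sketch under assumed toy dimensions, not drawn from the cited works.

import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))   # value table learned from experience
alpha, gamma = 0.1, 0.99              # learning rate, discount factor

def q_update(s, a, r, s_next):
    # Temporal-difference target built purely from an observed transition;
    # no transition model or task-dynamics knowledge is required.
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

q_update(s=3, a=2, r=1.0, s_next=7)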
“…Atari 2600 games (Bellemare et al., 2012), DOTA (Berner et al., 2019), and Starcraft II (Vinyals et al., 2019), with such video games serving as the benchmark for DRL testing and development. More recently, research has also demonstrated how DRL agents can extend beyond simulated environments to achieve successful multiagent performance in physical system tasks (Shafti et al., 2020; Morgan et al., 2021).…”
Section: Deep Reinforcement Learning
Citation type: mentioning
Confidence: 99%
“…(Nikolaidis et al., 2017b; Mohammad and Nishida, 2008; Nikolaidis et al., 2017a)]. The studies that use “co-learning” tend to take a more symmetrical approach by looking at agent or robot learning as well as human learning, and pay more attention to the learning process and the changing strategies of the human, often looking at many repetitions of a task (Ramakrishnan, Zhang, and Shah, 2017; C.-S. Lee et al., 2020; C. Lee et al., 2018; Shafti et al., 2020). Studies on co-evolution, on the other hand, monitor a long-term real-world application in which the behavior of the human as well as the robot subtly changes over time (Döppner, Derckx, and Schoder, 2019).…”
Section: Co-learning: Background and Definition
Citation type: mentioning
Confidence: 99%