2019
DOI: 10.1016/j.patrec.2018.05.023

Neural network based reinforcement learning for audio–visual gaze control in human–robot interaction

Abstract: This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearance. In particular, we use a recurrent neural…
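
The abstract describes a recurrent neural network trained with reinforcement learning to choose gaze actions from the robot's own audio-visual observations. Below is a minimal illustrative sketch of that kind of agent, not the authors' implementation: a recurrent Q-network in PyTorch updated with a one-step Q-learning loss. The feature dimensions, number of gaze actions, network sizes, and the random stand-in reward are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' implementation): a recurrent Q-network that maps
# fused audio-visual features to discrete gaze actions and is trained with one-step
# Q-learning. Dimensions, sizes, and the reward below are illustrative assumptions.
import torch
import torch.nn as nn

class RecurrentGazeQNet(nn.Module):
    def __init__(self, visual_dim=64, audio_dim=16, hidden_dim=128, n_gaze_actions=5):
        super().__init__()
        self.encoder = nn.Linear(visual_dim + audio_dim, hidden_dim)
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)  # recurrence over the observation sequence
        self.q_head = nn.Linear(hidden_dim, n_gaze_actions)          # one Q-value per gaze action

    def forward(self, obs_seq, h=None):
        # obs_seq: (batch, time, visual_dim + audio_dim), concatenated audio-visual features
        x = torch.relu(self.encoder(obs_seq))
        out, h = self.rnn(x, h)
        return self.q_head(out), h  # Q-values at every time step, plus the recurrent state

# One Q-learning update on a toy transition sequence (random data stands in for real features).
net = RecurrentGazeQNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
gamma = 0.9

obs = torch.randn(1, 10, 80)            # 10 time steps of fused audio-visual features (64 + 16 dims)
actions = torch.randint(0, 5, (1, 10))  # gaze actions actually taken
rewards = torch.rand(1, 10)             # e.g. a reward for keeping speakers in the field of view

q_values, _ = net(obs)
q_taken = q_values.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
with torch.no_grad():
    targets = rewards[:, :-1] + gamma * q_values[:, 1:].max(-1).values
loss = nn.functional.mse_loss(q_taken[:, :-1], targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In the paper's setting the reward is derived from the robot's own audio-visual observations (e.g. whether people are kept in view), so no external sensors or human supervision are needed; the random tensors above merely stand in for such data.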

Year Published: 2019–2023

Cited by 49 publications (58 citation statements)
References: 28 publications

“…Moreover, the availability of suitable datasets would ease future research on this topic. In parallel, we wish to use this framework in the future as a tool to improve the decision process of a robotic system in a social context such as [5].…”
Section: Results (mentioning)
confidence: 99%
“…However, we want to move away from feature engineering and formulate our human-robot interaction scenario as a deep reinforcement learning problem. Recent studies in HRI showed impressive results in employing deep reinforcement learning for various applications [14,15,12]. The main challenge for deep learning approaches is the lack of training data from human studies but we plan to tackle this problem using our current Bayesian-based model to simulate human behaviour data as a prior for the deep reinforcement learning model.…”
Section: Results (mentioning)
confidence: 99%
“…In deep networks, the selection of different hyper-parameters affects the accuracy of the algorithm [118]. This also applies to DRL; Lathuilière et al. [86] presented several experiments to evaluate the impact of some of the principal parameters of their deep network structure. Thus far, model-free RL learning a value function or a policy through trial and error is the most commonly used approach in social robotics.…”
Section: Future Outlook (mentioning)
confidence: 99%
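
As a toy illustration of the two points in the statement above (model-free, trial-and-error value learning, and the sensitivity of the result to hyper-parameters), the sketch below runs tabular Q-learning on a made-up 3-state chain and prints the learned start-state values for a few learning-rate and discount settings. The environment and all numbers are invented for illustration and do not come from any of the cited works.

```python
# Toy sketch: tabular Q-learning on an invented 3-state chain, swept over two
# common hyper-parameters (learning rate alpha and discount factor gamma).
import random

def run_q_learning(alpha, gamma, episodes=500, n_states=3, n_actions=2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(10):
            # epsilon-greedy action selection: mostly exploit, occasionally explore
            if rng.random() < 0.1:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s_next = min(s + 1, n_states - 1) if a == 1 else 0  # action 1 moves right, action 0 resets
            r = 1.0 if s_next == n_states - 1 else 0.0          # reward only in the final state
            # trial-and-error temporal-difference update of the value estimate
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return [round(q, 2) for q in Q[0]]

for alpha in (0.01, 0.5):
    for gamma in (0.5, 0.99):
        print(f"alpha={alpha}, gamma={gamma} -> Q(start)={run_q_learning(alpha, gamma)}")
```

Running the sweep shows how strongly the learned values depend on alpha and gamma, which is the kind of sensitivity the cited experiments examine for the principal parameters of a deep network.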