Companion Proceedings of the 20th International Conference on Intelligent User Interfaces 2015
DOI: 10.1145/2732158.2732180

Towards Integrating Real-Time Crowd Advice with Reinforcement Learning

Abstract: Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action i…
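The abstract describes folding human feedback into a reinforcement learner. Below is a minimal sketch of one common way to do that, assuming a tabular Q-learning agent whose environment reward is shaped by an additive crowd-feedback term; the class name, constants, and shaping scheme are illustrative assumptions, not the paper's exact method.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning with an additive human-feedback shaping term.
# Crowd feedback (e.g. workers flagging a sub-optimal action) is folded into
# the reward; names and constants are illustrative, not from the paper.

class ShapedQLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, feedback_weight=0.5):
        self.q = defaultdict(float)          # Q-values keyed by (state, action)
        self.actions = actions
        self.alpha = alpha                   # learning rate
        self.gamma = gamma                   # discount factor
        self.epsilon = epsilon               # exploration rate
        self.feedback_weight = feedback_weight

    def act(self, state):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, env_reward, feedback, next_state):
        # feedback in {-1, 0, +1}: -1 if crowd workers flag the action as
        # sub-optimal, +1 if they endorse it, 0 if no feedback arrived in time.
        shaped = env_reward + self.feedback_weight * feedback
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = shaped + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```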

Cited by 8 publications (2 citation statements)
References 10 publications

“…While Legion provided generalized user interface control, Salisbury et al (Salisbury, Stein, and Ramchurn 2015b) introduced additional control mediators that improved performance by focusing specifically on robotics applications. de la Cruz et al (de la Cruz et al 2015) got feedback from a crowd of workers in ∼0.3 seconds for mistakes made by an automated agent. Chung et al (Chung et al 2014) explored learning from initial demonstrations using crowd feedback for motion planning problems.…”
Section: Robotics and Autonomous Control
confidence: 99%
“…Practically, a weight could be computed for each human which indicates the competence level of that human at the task, and all trajectories supplied by that human would be weighted accordingly. This approach would be useful for example in a crowdsourcing framework [22].…”
Section: E. Human Elicited Priors
confidence: 99%
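The statement above proposes weighting each human's trajectories by a per-contributor competence score. A minimal sketch of that idea follows; the specific proxy for competence (agreement with gold-standard checks) and the aggregation into weighted action counts are assumptions for illustration, not taken from the cited work.

```python
from collections import defaultdict

# Per-contributor competence weighting for crowd-supplied trajectories.
# A contributor's weight is their agreement rate on gold-standard items
# (an assumed proxy for competence); action preferences are then
# aggregated as weighted counts over all contributed trajectories.

def competence_weights(gold_answers, worker_answers):
    """gold_answers: {item_id: label}; worker_answers: {worker: {item_id: label}}."""
    weights = {}
    for worker, answers in worker_answers.items():
        checked = [item for item in answers if item in gold_answers]
        if not checked:
            weights[worker] = 0.0
            continue
        correct = sum(answers[item] == gold_answers[item] for item in checked)
        weights[worker] = correct / len(checked)
    return weights

def weighted_action_counts(trajectories, weights):
    """trajectories: {worker: [(state, action), ...]} -> weighted (state, action) counts."""
    counts = defaultdict(float)
    for worker, traj in trajectories.items():
        w = weights.get(worker, 0.0)
        for state, action in traj:
            counts[(state, action)] += w
    return counts
```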