Daehyung Park scite author profile

The detection of anomalous executions is valuable for reducing potential hazards in assistive manipulation. Multimodal sensory signals can be helpful for detecting a wide range of anomalies. However, the fusion of high-dimensional and heterogeneous modalities is a challenging problem. We introduce a long short-term memory based variational autoencoder (LSTM-VAE) that fuses signals and reconstructs their expected distribution. We also introduce an LSTM-VAE-based detector using a reconstruction-based anomaly score and a state-based threshold. For evaluations with 1,555 robot-assisted feeding executions including 12 representative types of anomalies, our detector had a higher area under the receiver operating characteristic curve (AUC) of 0.8710 than 5 other baseline detectors from the literature. We also show the multimodal fusion through the LSTM-VAE is effective by comparing our detector with 17 raw sensory signals versus 4 hand-engineered features.

Multimodal execution monitoring for anomaly detection during robot manipulation

Park, Erickson, Bhattacharjee, et al. 2016

Online detection of anomalous execution can be valuable for robot manipulation, enabling robots to operate more safely, determine when a behavior is inappropriate, and otherwise exhibit more common sense. By using multiple complementary sensory modalities, robots could potentially detect a wider variety of anomalies, such as anomalous contact or a loud utterance by a human. However, task variability and the potential for false positives make online anomaly detection challenging, especially for long-duration manipulation behaviors. In this paper, we provide evidence for the value of multimodal execution monitoring and the use of a detection threshold that varies based on the progress of execution. Using a data-driven approach, we train an execution monitor that runs in parallel to a manipulation behavior. Like previous methods for anomaly detection, our method trains a hidden Markov model (HMM) using multimodal observations from non-anomalous executions. In contrast to prior work, our system also uses a detection threshold that changes based on the execution progress. We evaluated our approach with haptic, visual, auditory, and kinematic sensing during a variety of manipulation tasks performed by a PR2 robot. The tasks included pushing doors closed, operating switches, and assisting able-bodied participants with eating yogurt. In our evaluations, our anomaly detection method performed substantially better with multimodal monitoring than single modality monitoring. It also resulted in more desirable ROC curves when compared with other detection threshold methods from the literature, obtaining higher true positive rates for comparable false positive rates.
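
The idea of a detection threshold that changes with execution progress can be illustrated with a minimal sketch. This is not the authors' HMM-based monitor: it assumes a generic per-time-step anomaly score (higher means more anomalous), approximates execution progress by normalized time, and sets each progress bin's threshold to the mean plus `c` standard deviations of the scores seen in non-anomalous runs. The function names, `n_bins`, and `c` are all hypothetical.

```python
from statistics import mean, stdev


def progress_bin(t, run_length, n_bins):
    """Map time step t to a progress bin (assumption: progress = normalized time)."""
    return min(t * n_bins // run_length, n_bins - 1)


def fit_progress_thresholds(train_runs, n_bins=2, c=2.0):
    """Fit one threshold per progress bin from non-anomalous runs.

    train_runs: list of runs, each a list of anomaly scores.
    Returns a list of n_bins thresholds (mean + c * sample std per bin).
    """
    bins = [[] for _ in range(n_bins)]
    for run in train_runs:
        for t, score in enumerate(run):
            bins[progress_bin(t, len(run), n_bins)].append(score)
    return [mean(scores) + c * stdev(scores) for scores in bins]


def detect(run, thresholds):
    """Return the first time step whose score exceeds its bin's threshold, else None."""
    n_bins = len(thresholds)
    for t, score in enumerate(run):
        if score > thresholds[progress_bin(t, len(run), n_bins)]:
            return t
    return None


# Hypothetical scores: low early in the task, higher late in the task.
train_runs = [
    [1.0, 1.2, 3.0, 3.2],
    [1.1, 0.9, 2.9, 3.1],
    [0.9, 1.0, 3.1, 3.0],
]
thresholds = fit_progress_thresholds(train_runs)
```

With these made-up scores, a run such as `[1.0, 2.5, 3.0, 3.1]` is flagged at its second step, because 2.5 is unusual early in the task even though it would be ordinary later. A single fixed threshold set high enough to tolerate the late-stage scores would miss it.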

A multimodal execution monitor with anomaly classification for robot-assisted feeding

Park, Kim, Hoshi, et al. 2017

Multimodal anomaly detection for assistive robots

2018

Active robot-assisted feeding with a general-purpose mobile manipulator: Design, evaluation, and lessons learned

Park, Hoshi, Mahajan, et al. 2020

Robotics and Autonomous Systems

Eating is an essential activity of daily living (ADL) for staying healthy and living at home independently. Although numerous assistive devices have been introduced, many people with disabilities are still restricted from independent eating due to the devices' physical or perceptual limitations. In this work, we present a new meal-assistance system and evaluations of this system with people with motor impairments. We also discuss learned lessons and design insights based on the evaluations. The meal-assistance system uses a general-purpose mobile manipulator, a Willow Garage PR2, which has the potential to serve as a versatile form of assistive technology. Our active feeding framework enables the robot to autonomously deliver food to the user's mouth, reducing the need for head movement by the user. The user interface, visually-guided behaviors, and safety tools allow people with severe motor impairments to successfully use the system. We evaluated our system with a total of 10 able-bodied participants and 9 participants with motor impairments. Both groups of participants successfully ate various foods using the system and reported high rates of success for the system's autonomous behaviors. In general, participants who operated the system reported that it was comfortable, safe, and easy-to-use.

Multimodal estimation and communication of latent semantic knowledge for robust execution of robot instructions

Arkin, Park, Roy, et al. 2020

The International Journal of Robotics Research

The goal of this article is to enable robots to perform robust task execution following human instructions in partially observable environments. A robot’s ability to interpret and execute commands is fundamentally tied to its semantic world knowledge. Commonly, robots use exteroceptive sensors, such as cameras or LiDAR, to detect entities in the workspace and infer their visual properties and spatial relationships. However, semantic world properties are often visually imperceptible. We posit the use of non-exteroceptive modalities including physical proprioception, factual descriptions, and domain knowledge as mechanisms for inferring semantic properties of objects. We introduce a probabilistic model that fuses linguistic knowledge with visual and haptic observations into a cumulative belief over latent world attributes to infer the meaning of instructions and execute the instructed tasks in a manner robust to erroneous, noisy, or contradictory evidence. In addition, we provide a method that allows the robot to communicate knowledge dissonance back to the human as a means of correcting errors in the operator’s world model. Finally, we propose an efficient framework that anticipates possible linguistic interactions and infers the associated groundings for the current world state, thereby bootstrapping both language understanding and generation. We present experiments on manipulators for tasks that require inference over partially observed semantic properties, and evaluate our framework’s ability to exploit expressed information and knowledge bases to facilitate convergence, and generate statements to correct declared facts that were observed to be inconsistent with the robot’s estimate of object properties.

3D Human Pose Estimation on a Configurable Bed from a Pressure Image

Clever, Kapusta, Park, et al. 2018

Robots have the potential to assist people in bed, such as in healthcare settings, yet bedding materials like sheets and blankets can make observation of the human body difficult for robots. A pressure-sensing mat on a bed can provide pressure images that are relatively insensitive to bedding materials. However, prior work on estimating human pose from pressure images has been restricted to 2D pose estimates and flat beds. In this work, we present two convolutional neural networks to estimate the 3D joint positions of a person in a configurable bed from a single pressure image. The first network directly outputs 3D joint positions, while the second outputs a kinematic model that includes estimated joint angles and limb lengths. We evaluated our networks on data from 17 human participants with two bed configurations: supine and seated. Our networks achieved a mean joint position error of 77 mm when tested with data from people outside the training set, outperforming several baselines. We also present a simple mechanical model that provides insight into ambiguity associated with limbs raised off of the pressure mat, and demonstrate that Monte Carlo dropout can be used to estimate pose confidence in these situations. Finally, we provide a demonstration in which a mobile manipulator uses our network's estimated kinematic model to reach a location on a person's body in spite of the person being seated in a bed and covered by a blanket.

Leveraging Past References for Robust Language Grounding

Roy, Noseworthy, Paul, et al. 2019

Grounding referring expressions to objects in an environment has traditionally been considered a one-off, ahistorical task. However, in realistic applications of grounding, multiple users will repeatedly refer to the same set of objects. As a result, past referring expressions for objects can provide strong signals for grounding subsequent referring expressions. We therefore reframe the grounding problem from the perspective of coreference detection and propose a neural network that detects when two expressions are referring to the same object. The network combines information from vision and past referring expressions to resolve which object is being referred to. Our experiments show that detecting referring expression coreference is an effective way to ground objects described by subtle visual properties, which standard visual grounding models have difficulty capturing. We also show the ability to detect object coreference allows the grounding model to perform well even when it encounters object categories not seen in the training data.

Daehyung Park

A Multimodal Anomaly Detector for Robot-Assisted Feeding Using an LSTM-Based Variational Autoencoder
