Comprehension of spoken natural language is an essential skill for robots to communicate with humans effectively. However, handling unconstrained spoken instructions is challenging due to (1) the complex structures and wide variety of expressions used in spoken language, and (2) the inherent ambiguity of human instructions. In this paper, we propose the first comprehensive system for controlling robots with unconstrained spoken language, one that can effectively resolve ambiguity in spoken instructions. Specifically, we integrate deep learning-based object detection with natural language processing technologies to handle unconstrained spoken instructions, and propose a method for robots to resolve instruction ambiguity through dialogue. Through experiments in both a simulated environment and on a physical industrial robot arm, we demonstrate that our system understands natural instructions from human operators effectively, and show that higher success rates on the object-picking task can be achieved through an interactive clarification process.
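To make the clarification loop concrete, here is a minimal, self-contained sketch of the idea: score how well each detected object matches the instruction, and ask a follow-up question whenever the top two candidates are too close to call. The candidate format, the toy word-overlap scorer, and the 0.15 margin are all illustrative assumptions, not the paper's actual deep learning-based models.

```python
def pick_with_clarification(instruction, candidates, score, margin=0.15, max_turns=3):
    """Return the candidate the instruction most likely refers to,
    asking clarifying questions while the top two scores are close."""
    for _ in range(max_turns):
        ranked = sorted(zip(score(instruction, candidates), candidates),
                        key=lambda p: p[0], reverse=True)
        (s1, best), (s2, second) = ranked[0], ranked[1]
        if s1 - s2 >= margin:                      # unambiguous: pick it
            return best
        # Ambiguous: ask the operator, then fold the answer into the query.
        answer = input(f"Did you mean the {best['name']} or the {second['name']}? ")
        instruction += " " + answer
    return ranked[0][1]                            # best guess after max_turns

# Toy scorer: fraction of an object's name words that appear in the instruction.
def overlap(text, cands):
    return [sum(w in text.split() for w in c["name"].split()) / len(c["name"].split())
            for c in cands]

objects = [{"name": "red cup"}, {"name": "blue cup"}, {"name": "green ball"}]
print(pick_with_clarification("pick up the cup", objects, overlap))
```

With the toy scorer, "pick up the cup" ties the two cups, so the loop asks one question; an answer like "blue" breaks the tie on the next scoring pass.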
This paper describes Team Delft's robot, which won the Amazon Picking Challenge 2016, including both the Picking and the Stowing competitions. The goal of the challenge is to automate pick-and-place operations in unstructured environments, specifically the shelves in an Amazon warehouse. Team Delft's robot is based on an industrial robot arm, 3D cameras, and a customized gripper. The robot's software uses ROS to integrate off-the-shelf components and modules developed specifically for the competition, implementing deep learning and other AI techniques for object recognition and pose estimation, grasp planning, and motion planning. This paper describes the main components of the system, and discusses its performance and results at the Amazon Picking Challenge 2016 finals.
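As a rough illustration of how such a pipeline fits together, the sketch below wires up hypothetical stand-ins for the sense, recognize, plan, and execute stages with dummy data; the stage names are assumptions for readability, not the interfaces of Team Delft's actual ROS nodes.

```python
# Each stage is a trivial stub standing in for a real ROS component.
def capture_scan(bin_id):       return {"cloud": [], "rgb": []}          # 3D camera
def recognize(rgb):             return {"duct_tape": {"bbox": (0, 0, 10, 10)}}
def estimate_pose(det, cloud):  return {"xyz": (0.1, 0.2, 0.3)}          # object pose
def plan_grasp(pose):           return {"approach": pose["xyz"], "tool": "suction"}
def plan_motion(grasp, cloud):  return ["pre-grasp", "grasp", "retract"]
def execute(trajectory):        print("executing:", trajectory)

def pick_item(item, bin_id):
    scan = capture_scan(bin_id)                 # sense the shelf bin
    det = recognize(scan["rgb"])[item]          # deep-learning detection
    pose = estimate_pose(det, scan["cloud"])    # pose estimation from the cloud
    grasp = plan_grasp(pose)                    # grasp choice for the custom gripper
    execute(plan_motion(grasp, scan["cloud"]))  # collision-aware motion plan

pick_item("duct_tape", "bin_A")
```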
Computational RFID (CRFID) devices are emerging platforms that can enable perennial computation and sensing by eliminating the need for batteries. Although much research has been devoted to improving upstream (CRFID-to-RFID-reader) communication rates, the opposite direction has so far been neglected, presumably due to the difficulty of guaranteeing fast and error-free transfer amid the frequent power interruptions of CRFIDs. With growing interest in markets where CRFIDs are forever embedded in many structures, this void needs to be filled. We therefore propose Wisent, a robust downstream communication protocol for CRFIDs that operates on top of the legacy UHF RFID communication protocol, EPC C1G2. The novelty of Wisent is its ability to adaptively change the frame length sent by the reader, based on its length-throttling mechanism, to minimize transfer times under varying channel conditions. We present an implementation of Wisent for the WISP 5 and an off-the-shelf RFID reader. Our experiments show that Wisent allows transfers up to 16 times faster than a baseline, non-adaptive shortest-frame case (i.e., single-word frames) at sub-meter distance. As a case study, we show how Wisent enables wireless CRFID reprogramming, demonstrating the world's first wirelessly reprogrammable (software-defined) CRFID.
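The throttling idea lends itself to a short simulation. The sketch below assumes one plausible policy (grow the frame while frames get through, shrink it when one is lost) over a link that drops whole frames at random; the constants, the policy, and the send_frame stub are illustrative, not the EPC C1G2 specifics or the exact mechanism from the paper.

```python
import random

MIN_WORDS, MAX_WORDS = 1, 32

def send_frame(frame, loss_rate):
    """Stand-in for the reader-to-CRFID link: the whole frame is lost
    with probability loss_rate (e.g., a tag power interruption)."""
    return random.random() > loss_rate

def transfer(words, loss_rate=0.2):
    """Send `words` in variable-length frames, throttling the length
    to the channel. Returns the number of frames it took."""
    length, i, frames = MIN_WORDS, 0, 0
    while i < len(words):
        frames += 1
        if send_frame(words[i:i + length], loss_rate):
            i += length                            # delivered: advance
            length = min(length * 2, MAX_WORDS)    # channel looks clean: grow
        else:
            length = max(length // 2, MIN_WORDS)   # loss: throttle down
    return frames

print(transfer(list(range(200))))  # far fewer frames than 200 one-word sends
```

On a clean channel the frame length ramps up and stays long; under heavy loss it collapses toward the single-word baseline, which is the adaptivity the abstract credits for the speedup.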
Estimation of tactile properties from vision, such as slipperiness or roughness, is important for interacting effectively with the environment. These tactile properties help us decide which actions to choose and how to perform them. For example, we can drive more slowly if we see that we have poor traction, or grasp more tightly if an item looks slippery. We believe that this ability also helps robots enhance their understanding of the environment, and thus enables them to tailor their actions to the situation at hand. We therefore propose a model to estimate the degree of tactile properties from visual perception alone (e.g., the level of slipperiness or roughness). Our method extends an encoder-decoder network, in which the latent variables are visual and tactile features. In contrast to previous works, our method does not require manual labeling, but only RGB images and the corresponding tactile sensor data. All our data is collected with a webcam and a uSkin tactile sensor mounted on the end-effector of a Sawyer robot, which strokes the surfaces of 25 different materials. We show that our model generalizes to materials not included in the training data.
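A minimal PyTorch sketch of the label-free pairing idea follows: encode the RGB image into a latent vector that must reconstruct both the image and the matching tactile reading, so the latent space is forced to carry tactile information. The layer sizes, the 32x32 crops, the 48-dimensional tactile vector, and the single shared latent (standing in for the paper's separate visual and tactile features) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class VisuoTactileAE(nn.Module):
    """Encoder-decoder that reconstructs both the image and the paired
    tactile reading from one latent code (illustrative sizes only)."""
    def __init__(self, latent_dim=16, tactile_dim=48):
        super().__init__()
        self.encoder = nn.Sequential(                       # RGB -> latent
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(latent_dim),
        )
        self.image_decoder = nn.Sequential(                 # latent -> flat image
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, 3 * 32 * 32),
        )
        self.tactile_decoder = nn.Sequential(               # latent -> tactile
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, tactile_dim),
        )

    def forward(self, rgb):
        z = self.encoder(rgb)
        return self.image_decoder(z), self.tactile_decoder(z), z

def loss_fn(img_hat, tac_hat, rgb, tac):
    # The tactile term is what ties touch into the visual latent, with no
    # manual labels: only paired (image, tactile) samples are needed.
    return (nn.functional.mse_loss(img_hat, rgb.flatten(1)) +
            nn.functional.mse_loss(tac_hat, tac))

rgb = torch.rand(8, 3, 32, 32)    # batch of webcam crops
tac = torch.rand(8, 48)           # matching uSkin readings
img_hat, tac_hat, z = VisuoTactileAE()(rgb)
print(loss_fn(img_hat, tac_hat, rgb, tac))
```

At test time only the encoder is needed: z computed from a new image is the visual estimate of the surface's tactile character.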