A deep learning architecture is proposed to predict graspable locations for robotic manipulation. It handles scenes in which zero, one, or multiple objects are visible. By formulating the learning problem as classification with null hypothesis competition rather than regression, the deep neural network takes an RGB-D image as input and predicts multiple grasp candidates for a single object or for multiple objects in a single shot. The method outperforms state-of-the-art approaches on the Cornell dataset, with 96.0% and 96.1% accuracy on image-wise and object-wise splits, respectively. Evaluation on a multi-object dataset illustrates the generalization capability of the architecture. Grasping experiments achieve a 96.0% grasp localization rate and an 89.0% grasping success rate on a test set of household objects. The real-time process takes less than 0.25 s from image to plan.
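The idea of classification with null hypothesis competition in the abstract above can be illustrated with a minimal sketch. It assumes, hypothetically, that each grasp candidate receives a score for a "no grasp" class and for a small set of discrete orientation classes, and that a candidate is kept only when an orientation class outscores the null class. The function name `select_grasps`, the convention that index 0 is the null class, and the example scores are all illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

NULL_CLASS = 0  # assumed convention: index 0 is the "no grasp" (null) class


def select_grasps(scores: List[List[float]]) -> List[Tuple[int, int, float]]:
    """Return (candidate_index, orientation_class, score) for every candidate
    whose best-scoring class is an orientation class rather than the null class."""
    kept = []
    for i, row in enumerate(scores):
        # Index of the highest score; on a tie the lowest index wins,
        # so the null class is favoured when it ties with an orientation.
        best = max(range(len(row)), key=lambda k: row[k])
        if best != NULL_CLASS:
            kept.append((i, best, row[best]))
    return kept


# Made-up scores for three candidates: [null, orientation 1, orientation 2]
scores = [
    [0.70, 0.20, 0.10],  # null class wins: no grasp proposed
    [0.10, 0.80, 0.10],  # orientation class 1 wins
    [0.30, 0.25, 0.45],  # orientation class 2 wins
]
grasps = select_grasps(scores)
```

Because every candidate competes against the null class independently, the same pass can return zero, one, or several grasps, which matches the no-object, single-object, and multi-object cases the abstract mentions.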
Mobile sensing is an emerging technology that uses agent-participatory data for decision making or state estimation, including in multimedia applications. This article investigates the structure of mobile sensing schemes and introduces crowdsourcing methods for mobile sensing. Inspired by social networks, one can establish trust among participatory agents to leverage the wisdom of crowds for mobile sensing. A prototype of a social-network-inspired mobile multimedia and sensing application is presented for illustrative purposes. Numerical experiments on real-world datasets show that crowdsourcing improves the performance of mobile sensing. Challenges for mobile sensing with respect to the Internet layers are discussed.
A human-in-the-loop system is proposed to enable collaborative manipulation tasks for persons with physical disabilities. Studies show that the cognitive burden on the subject decreases as the autonomy of the assistive system increases. Our framework obtains high-level intent from the user to specify manipulation tasks. The system processes sensor input to interpret the user's environment. Augmented reality glasses provide egocentric visual feedback of the interpretation and summarize robot affordances on a menu. A tongue drive system serves as the input modality for triggering a robotic arm to execute the tasks. Assistance experiments compare the system to Cartesian control and to state-of-the-art approaches. Our system achieves competitive results with faster completion times by simplifying manipulation tasks.