We present the design, integration, and evaluation of a full-stack robotic system called RoMan, which can conduct autonomous field operations involving physical interaction with its environment. RoMan offers autonomous behaviors that can be triggered from succinct, high-level human input such as “open this box and retrieve the bag inside.” The robot’s behaviors are driven by a set of planners and controllers grounded in perceptual reconstructions of the environment. These behaviors are articulated by a behavior tree that translates high-level operator input into programs of increasing sensorimotor expressiveness, ultimately driving the lowest-level controllers. The software system is implemented in ROS as a set of independent processes connected by synchronous and asynchronous communication, and distributed across two on-board planning/control computers. The behavior stack drives a novel platform consisting of a pair of custom, 500 Nm/axis manipulators mounted on a rotatable torso aboard a tracked platform. The robot’s head is equipped with forward-looking depth cameras, and the arms carry wrist-mounted force-torque sensors and a mix of three- and four-finger grippers. We discuss design and implementation trade-offs affecting the entire hardware-software stack and high-level manipulation behaviors. We also demonstrate the applicability of the system for solving two manipulation tasks: 1) removing heavy debris from a roadway, where 64% of end-to-end autonomous runs required at most one human intervention; and 2) retrieving an item from a closed container, with a fully autonomous success rate of 56%. Finally, we indicate lessons learned and suggest outstanding research problems.
For humans and robots to collaborate effectively as teammates in unstructured environments, robots must be able to construct semantically rich models of the environment, communicate efficiently with teammates, and perform sequences of tasks robustly with minimal human intervention, as direct human guidance may be infrequent and/or intermittent. Contemporary architectures for human-robot interaction often rely on engineered human-interface devices or structured languages that require extensive prior training and inherently limit the kinds of information that humans and robots can communicate. Natural language, particularly when situated with a visual representation of the robot’s environment, allows humans and robots to exchange information about abstract goals, specific actions, and/or properties of the environment quickly and effectively. In addition, it serves as a mechanism to resolve inconsistencies in the mental models of the environment across the human-robot team. This article details a novel intelligence architecture that exploits a centralized representation of the environment to perform complex tasks in unstructured environments. The centralized environment model is informed by a visual perception pipeline, declarative knowledge, deliberate interactive estimation, and a multimodal interface. The language pipeline also exploits proactive symbol grounding to resolve uncertainty in ambiguous statements through inverse semantics. A series of experiments on three different unmanned ground vehicles demonstrates the utility of this architecture through its robust ability to perform language-guided spatial navigation, mobile manipulation, and bidirectional communication with human operators. Experimental results give examples of component-level behaviors and overall system performance that guide a discussion on observed performance and opportunities for future innovation.
Over the past decade, robotics technologies and the tools used to develop them have undergone significant advancement and transformation. In this paper, we observe and assess these changes from the perspective of a 10-year research program sponsored by the DEVCOM Army Research Laboratory, named the Robotics Collaborative Technology Alliance. Beyond advancing the state of the art by conducting research at some of the top academic institutions across the United States, the alliance also worked with top government and industry partners to integrate the research into meaningful experiments and demonstrations with military relevance. This paper assesses and provides insight into the effectiveness of the collaboration tools used by the team, management methods, data collection efforts, and live and virtual experiments. Ultimately, we seek to inform future efforts requiring disparate and distant teams of the potential advantages and challenges of using such tools by sharing our lessons learned on how to work most effectively as a team of teams to advance robotics.