Figure 1. Large-scale map reconstructed online by SkiMap++ using a mobile robot equipped with a head-mounted RGB-D camera. Purple spheres denote areas, identified during reconstruction, that are likely to contain object instances. Magnified circles show the outcomes of the final Instance Estimation Algorithm, which is run in the aforementioned areas only. The whole map is acquired by relying on the robot's own odometry to track camera poses over time.
Abstract—We introduce SkiMap++, an extension of the recently proposed SkiMap mapping framework for robot navigation [1]. The extension enriches the map with semantic information concerning the presence in the environment of certain objects that the robot may usefully recognize, e.g. for the sake of grasping them. More precisely, the map can accommodate information about the spatial locations of certain 3D object features, as determined by matching the visual features extracted from the incoming frames through a random forest learned off-line from a set of object models. Thereby, evidence about the presence of object features is gathered from multiple vantage points alongside the standard geometric mapping task, so as to enable recognizing the objects and estimating their 6-DOF poses. As a result, SkiMap++ can reconstruct the geometry of large-scale environments as well as localize relevant objects therein (Fig. 1) in real-time on a CPU. As an additional contribution, we present an RGB-D dataset featuring ground-truth camera and object poses, which may be useful to researchers interested in pursuing SLAM alongside object recognition, a topic often referred to as Semantic SLAM.