In order to meet the increasing demands of mobile service robot applications, a dedicated perception module is an essential requirement for the interaction with users in real-world scenarios. In particular, multi sensor fusion and human re-identification are recognized as active research fronts. Through this paper we contribute to the topic and present a modular detection and tracking system that models position and additional properties of persons in the surroundings of a mobile robot. The proposed system introduces a probability-based data association method that besides the position can incorporate face and color-based appearance features in order to realize a re-identification of persons when tracking gets interrupted. The system combines the results of various state-of-the-art image-based detection systems for person recognition, person identification and attribute estimation. This allows a stable estimate of a mobile robot’s user, even in complex, cluttered environments with long-lasting occlusions. In our benchmark, we introduce a new measure for tracking consistency and show the improvements when face and appearance-based re-identification are combined. The tracking system was applied in a real world application with a mobile rehabilitation assistant robot in a public hospital. The estimated states of persons are used for the user-centered navigation behaviors, e.g., guiding or approaching a person, but also for realizing a socially acceptable navigation in public environments.
Person re-identification plays a key role in applications where a mobile robot needs to track its users over a long period of time, even if they are partially unobserved for some time, in order to follow them or be available on demand. In this context, deep-learning based real-time feature extraction on a mobile robot is often performed on specialpurpose devices whose computational resources are shared for multiple tasks. Therefore, the inference speed has to be taken into account. In contrast, person re-identification is often improved by architectural changes that come at the cost of significantly slowing down inference. Attention blocks are one such example. We will show that some well-performing attention blocks used in the state of the art are subject to inference costs that are far too high to justify their use for mobile robotic applications. As a consequence, we propose an attention block that only slightly affects the inference speed while keeping up with much deeper networks or more complex attention blocks in terms of re-identification accuracy. We perform extensive neural architecture search to derive rules at which locations this attention block should be integrated into the architecture in order to achieve the best trade-off between speed and accuracy. Finally, we confirm that the best performing configuration on a re-identification benchmark also performs well on an indoor robotic dataset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.