Accurately modeling the crowd's head scale variations is an effective way to improve the counting accuracy of the crowd counting methods. Most counting networks apply a multibranch network structure to obtain different scales of head features. Although they have achieved promising results, they do not perform very well on the extreme scale variation scene due to the limited scale representability. Meanwhile, these methods are prone to recognize background objects as foreground crowds in complex scenes due to the limited context and high-level semantic information. We propose a compositional multi-scale feature enhanced learning approach (COMAL) for crowd counting to handle the above limitations. COMAL enhances the multi-scale feature representations from three aspects: (1) The semantic enhanced module (SEM) is developed for embedding the high-level semantic information to the multi-scale features; (2) The diversity enhanced module (DEM) is proposed to enrich the variety of crowd features' different scales; (3) The context enhanced module (CEM) is designed for strengthening the multi-scale features with more context information. Based on the proposed COMAL, we develop a crowd counting network under the encoder-decoder framework and perform extensive experiments on ShanghaiTech, UCF CC 50, and UCF-QNRF datasets. Qualitative and quantitive results demonstrate the effectiveness of the proposed COMAL.
Effective searching for target objects in indoor scenes is essential for household robots to perform daily tasks. With the establishment of a precise map, the robot can navigate to a fixed static target. However, it is difficult for mobile robots to find movable objects like cups. To address this problem, we establish an object search framework that combines navigation map, semantic map, and scene graph. The robot updates the scene graph to achieve a long-term target search. Considering the different start positions of the robots, we weigh the distance the robot walks and the probability of finding objects to achieve global path planning. The robot can continuously update the scene graph in a dynamic environment to memorize the position relation of objects in the scene. This method has been realized in both simulation and real-world environments. The experimental results show the feasibility and effectiveness of this method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.