Person-following is an important robot service. The dominant trend in person-following is to localize the target person with computer vision, owing to the wide field of view and rich information a camera captures from the real world. However, most existing approaches employ a detecting-by-tracking strategy, which suffers from low speed as detection models grow more complex, and from unstable region-of-interest (ROI) outputs in unexpected situations. In this paper, we propose a novel classification-lock strategy for localizing the target person, which combines visual tracking with object detection to adapt the localization model to different environments online while maintaining a high frame rate (FPS) on a mobile platform. The approach consists of three key parts. First, a pairwise cluster tracker localizes the person. A positive-negative classifier then verifies the tracker's result and updates the tracking model. In addition, a detector pre-trained with a CPU-optimized convolutional neural network further refines the tracking result. In the experiments, our approach is compared with other state-of-the-art approaches on the Vojir tracking dataset, using its three human sequences, to assess the quality of person localization. Moreover, common challenges in the following task are evaluated on several image sequences in a static scene, and a dynamic scene is used to evaluate the improvement contributed by the classification-lock strategy. Finally, our approach is deployed on a mobile robot to test its person-following performance. Compared with other state-of-the-art methods, our approach achieves the highest score (0.91 recall). In both the static and dynamic scenes, the ROI output with the classification-lock strategy is significantly better than without it.
Our approach also succeeds in a long-term following task in an indoor multi-floor scenario.
This paper proposes a method that minimizes the energy consumed by a biped robot by searching for the optimal locations of the mass centers of the robot's links using a genetic algorithm. It also presents a learning controller for repetitive gait control of the biped walking robot. The learning control scheme consists of a feedforward learning rule and a linear feedback control input for stabilization of the learning system. The feasibility of learning control for the biped robot's motion is shown via computer simulations and experiments with a 24-DOF biped walking robot.
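The search over mass-center locations described above is a standard real-valued genetic-algorithm optimization. The following is a generic toy sketch of such a GA (elitist selection, blend crossover, Gaussian mutation); the `energy` cost function and all parameter values are hypothetical stand-ins, not the paper's link model or encoding.

```python
import random

def genetic_minimize(cost, bounds, pop_size=40, generations=60, seed=1):
    """Toy real-valued GA: elitist selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=cost)
        elite = scored[: pop_size // 5]            # keep the best 20%
        children = list(elite)                     # elitism: best survive unchanged
        while len(children) < pop_size:
            a, b = rng.sample(elite, 2)            # two elite parents
            # blend crossover plus Gaussian mutation, clipped to bounds
            child = [(x + y) / 2 + rng.gauss(0, 0.05 * (hi - lo))
                     for x, y, (lo, hi) in zip(a, b, bounds)]
            child = [min(max(v, lo), hi) for v, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    return min(pop, key=cost)

# Hypothetical stand-in "energy" cost, minimized at mass centers (0.3, 0.5)
energy = lambda x: (x[0] - 0.3) ** 2 + (x[1] - 0.5) ** 2
best = genetic_minimize(energy, [(0.0, 1.0), (0.0, 1.0)])
```

In the paper's setting, `cost` would instead evaluate the consumed energy of a simulated gait for a candidate placement of the link mass centers.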
Simultaneous localization and mapping (SLAM) is an important function for service robots to self-navigate modern buildings. However, only a few existing applications allow them to move automatically between stories via elevators. Some approaches have accomplished this with the aid of dedicated hardware; this study shows that computer vision can be a promising alternative for button localization. In this paper, we propose a real-time multi-story SLAM system that overcomes the problem of detecting elevator buttons using a localization framework that combines tracking and detection. A two-stage deep neural network initially locates the positions of the target buttons, and a part-based tracker follows the target buttons in real time. A positive-negative classifier and a deep neural network (specialized for button-shape detection) correct the tracker's output in every frame. To allow the robot to self-navigate, a 2D grid mapping approach is used for localization and mapping. Then, when the robot navigates a floor, the A* algorithm generates the shortest path. In the experiments, two dynamic scenes (which include common elevator-button localization challenges) were used to evaluate the efficiency of our approach and compare it with other state-of-the-art methods. Our approach was also tested on a prototype robot system to assess how well it can navigate a multi-story building. The results show that our method can overcome the common background challenges that occur inside an elevator, and in doing so, it enables the mobile robot to autonomously navigate a multi-story building.
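The shortest-path step above is classic A* on a 2D occupancy grid. A minimal sketch (toy grid, 4-connected moves, Manhattan heuristic; the grid values and function name are illustrative, not the paper's implementation):

```python
import heapq

def astar(grid, start, goal):
    """A* shortest path on a 2D occupancy grid (0 = free, 1 = occupied)."""
    rows, cols = len(grid), len(grid[0])
    open_set = [(0, start)]          # (f-score, cell) min-heap
    g = {start: 0}                   # cost from start
    came_from = {}
    while open_set:
        _, cur = heapq.heappop(open_set)
        if cur == goal:              # reconstruct path back to start
            path = [cur]
            while cur in came_from:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0):
                ng = g[cur] + 1
                if ng < g.get(nxt, float("inf")):
                    g[nxt] = ng
                    came_from[nxt] = cur
                    # Manhattan distance: admissible for 4-connected grids
                    h = abs(goal[0] - nxt[0]) + abs(goal[1] - nxt[1])
                    heapq.heappush(open_set, (ng + h, nxt))
    return None                      # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))   # routes around the occupied row
```

In practice the occupancy grid would come from the robot's 2D map, with obstacle inflation applied before planning.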
Semantic segmentation is a fundamental research task in computer vision that aims to assign a category to every pixel. Currently, most existing methods utilize only the deepest feature map for decoding, while high-level features are inevitably lost during down-sampling. In the decoder, transposed convolution or bilinear interpolation is widely used to restore the size of the encoded feature map; however, few optimizations are applied during the up-sampling process, which is detrimental to grouping and classification performance. In this work, we propose a dual-pyramid encoder-decoder deep neural network (DPEDNet) to tackle these issues. The first pyramid integrates and encodes multi-resolution features through sequentially stacked merging, and the second pyramid decodes the features through dense atrous convolution with chained up-sampling. Without post-processing or multi-scale testing, the proposed network achieves state-of-the-art performance on two challenging benchmark image datasets covering both ground and aerial view scenes.
This paper reports a localization method for a mobile robot using ceiling images. The ceiling has landmarks that are not distinguishable from one another. The location of every landmark in the map is given a priori, while the correspondence between a detected landmark and a landmark in the map is not. Only the initial pose of the robot relative to the landmarks is given. The method uses a particle filter for localization. Along with estimating the robot pose, the method also associates each landmark detected in the ceiling image with a landmark in the map. The method was tested in an indoor environment with circular landmarks on the ceiling. The test verifies the feasibility of the method in environments where range data to walls or beacons are unavailable or severely corrupted by noise. This makes the method useful for localization in warehouses, where laser-range-finder measurements and range data from RF or ultrasonic beacons have large uncertainty.
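The particle-filter cycle underlying this kind of landmark-based localization is predict, weight by measurement likelihood, and resample. A simplified 1D sketch (a robot on a line measuring its range to one landmark at a known position; all values are hypothetical, and the real method works in 2D with data association):

```python
import random
import math

def particle_filter_step(particles, control, measurement, landmark, noise=0.5):
    """One predict-update-resample cycle of a toy 1D particle filter."""
    # Predict: apply the motion command with small process noise
    particles = [p + control + random.gauss(0, 0.1) for p in particles]
    # Update: weight each particle by the Gaussian likelihood of the
    # observed range to the landmark given the particle's position
    weights = [math.exp(-((abs(landmark - p) - measurement) ** 2)
                        / (2 * noise ** 2)) for p in particles]
    total = sum(weights) or 1e-12
    weights = [w / total for w in weights]
    # Resample: draw particles proportionally to their weights
    particles = random.choices(particles, weights=weights, k=len(particles))
    return particles, [1.0 / len(particles)] * len(particles)

random.seed(0)
particles = [random.uniform(0, 10) for _ in range(500)]  # uniform prior
# True robot near 2.0 moves +1.0; landmark at 8.0, measured range 5.0
particles, weights = particle_filter_step(particles, 1.0, 5.0, 8.0)
estimate = sum(particles) / len(particles)               # posterior mean near 3.0
```

The paper's method additionally carries a data-association hypothesis per particle, since its ceiling landmarks are indistinguishable from one another.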