In foreground segmentation, it is challenging to construct an effective background model that learns the spatial-temporal representation of the background. Recently, deep learning-based background models (DBMs), which can extract high-level features, have shown remarkable performance. However, existing state-of-the-art DBMs treat video segmentation as single-image segmentation and ignore the temporal cues in video sequences. To exploit temporal information more fully, this paper proposes, for the first time, a multi-input multi-output (MIMO) DBM framework, which is partially inspired by the binocular summation effect in human eyes. Our framework is an X-shaped network that allows the DBM to track temporal changes in a video sequence. Moreover, each output branch of our model can receive visual signals from two similar input frames simultaneously, as in the binocular summation mechanism. In addition, our model can be trained end-to-end using only a few training examples and without any postprocessing. We evaluate our method on the largest dataset for change detection (CDnet 2014) and achieve state-of-the-art performance with an average overall F-Measure of 0.9920.
INDEX TERMS: Foreground segmentation, background subtraction, deep learning, focal loss, binocular summation.
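The F-Measure reported above is the harmonic mean of precision and recall over foreground pixels. The following is a minimal sketch of that metric for a single pair of binary masks; the function name `f_measure` and the toy masks are illustrative assumptions, and the per-category averaging used for the overall CDnet 2014 score is not reproduced here.

```python
def f_measure(pred, truth):
    """F-Measure for two binary foreground masks given as flat 0/1 sequences."""
    # Count true positives, false positives and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        # No correctly detected foreground pixel: define the score as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
score = f_measure(pred, truth)
print(round(score, 4))
```

With these toy masks, precision and recall are both 2/3, so the F-Measure is also 2/3.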
Fiber-shaped supercapacitors have drawn much attention for their great potential in future portable and wearable electronics because of their outstanding flexibility, tiny volume, and good deformability. In this work, commercial poly(ethylene terephthalate) (PET) thread was successfully converted into an electrically conductive and electrochemically active thread by introducing copper sulfide (CuS) and polyaniline (PANI) via simple chemical bath deposition and electrochemical deposition. The resulting PANI/CuS/PET electrode combines the advantages of PET, CuS, and PANI and shows excellent physical and electrochemical performance. The fiber-shaped supercapacitor exhibits a high specific capacitance of 29 mF cm⁻² (116 mF cm⁻² for a single electrode) and good cycling stability, with 93.1% retention after 1000 cycles. With its simple preparation method and low-cost raw materials, this strategy provides a reference for the fabrication of portable and wearable energy storage devices.
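The two capacitance figures above differ by exactly a factor of four, which matches a common convention for symmetric two-electrode devices: two identical electrodes in series halve the capacitance, and normalizing by the area of both electrodes halves it again. The sketch below checks that arithmetic. The abstract does not state which convention was used, so the factor of four and the helper names are assumptions, as is applying the 93.1% retention to the device value.

```python
def single_electrode_from_device(device_capacitance):
    """Areal capacitance of one electrode, assuming a symmetric device
    whose value is normalized to the area of both electrodes."""
    return 4 * device_capacitance


def retained_capacitance(initial_capacitance, retention_percent):
    """Capacitance remaining after cycling, given a retention percentage."""
    return initial_capacitance * retention_percent / 100


device = 29.0  # mF cm^-2, device value quoted in the abstract
electrode = single_electrode_from_device(device)
after_cycling = retained_capacitance(device, 93.1)  # after 1000 cycles

print(electrode)                # 116.0 mF cm^-2
print(round(after_cycling, 1))  # about 27.0 mF cm^-2
```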
Although person re-identification (ReID) has drawn increasing research attention due to its potential for analyzing and processing massive amounts of monitoring data, it is very challenging to learn discriminative information when the people in the images are occluded, exhibit large pose variations, or are captured from different perspectives. To address this problem, we propose a novel joint attention person ReID (JA-ReID) architecture. The idea is to learn two complementary feature representations by combining a soft pixel-level attention mechanism and a hard region-level attention mechanism. The soft pixel-level attention mechanism learns a discriminative embedding of fine-grained information by exploring the salient parts of the feature maps. The hard region-level attention mechanism partitions the convolutional feature maps uniformly to learn local features. We achieve competitive results on three popular benchmarks: Market1501, DukeMTMC-reID, and CUHK03. The experimental results verify that the joint attention mechanism adapts to non-rigid deformation of the human body and can effectively improve ReID accuracy.
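The two attention branches described above can be illustrated on a toy single-channel feature map. This is a minimal sketch, not the JA-ReID modules themselves, which the abstract does not specify: the hard branch is modelled as average pooling over equal horizontal stripes, and the soft branch as a softmax over pixel activations (an assumed stand-in for a learned attention map). All function names and values are hypothetical.

```python
import math


def uniform_stripes(feature_map, parts):
    """Hard region-level sketch: split an H x W map into `parts` equal
    horizontal stripes and average-pool each stripe into one local feature."""
    height = len(feature_map)
    if height % parts != 0:
        raise ValueError("height must be divisible by parts")
    step = height // parts
    pooled = []
    for i in range(parts):
        rows = feature_map[i * step:(i + 1) * step]
        values = [v for row in rows for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def soft_attention_pool(feature_map):
    """Soft pixel-level sketch: weight every pixel by a softmax over the
    activations, so salient (large) responses dominate the pooled value."""
    values = [v for row in feature_map for v in row]
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return sum(v * e / total for v, e in zip(values, exps))


# Toy 4 x 2 feature map (hypothetical activations).
fmap = [[0.0, 0.0], [2.0, 2.0], [1.0, 3.0], [4.0, 0.0]]
local_features = uniform_stripes(fmap, 2)
global_feature = soft_attention_pool(fmap)
print(local_features, round(global_feature, 3))
```

The stripe features describe fixed body regions, while the softmax-weighted value is pulled toward the strongest activations, which is the sense in which the two representations are complementary.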
Recent research on mobile robots shows that convolutional neural networks (CNNs) achieve impressive performance in visual place recognition, especially in large-scale dynamic environments. However, CNN features yield large image representations that cannot meet the real-time demands of robot navigation. To address this problem, we use variance to evaluate the effectiveness of the feature maps obtained from a layer of the CNN and propose a novel method that retains the salient feature maps and adaptively binarizes them. Experimental results demonstrate the effectiveness and efficiency of our method. Compared with state-of-the-art methods for visual place recognition, our method greatly reduces the size of the image representation without a significant loss in precision.
Key words: visual place recognition, CNN, variance, feature map, binarization
Yutian Chen is a master's candidate at the Army Engineering University of PLA. His main research fields are deep learning and image processing.
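The selection and binarization steps described in the place recognition abstract can be sketched as follows. The abstract does not give the exact criteria, so several choices here are assumptions: variance is computed within each flattened feature map, the highest-variance maps are kept, each map is thresholded at its own mean as the "adaptive" rule, and binary codes are compared by Hamming distance. All names and values are hypothetical.

```python
from statistics import mean, pvariance


def select_salient_maps(feature_maps, keep):
    """Return the indices of the `keep` feature maps with the highest variance."""
    ranked = sorted(
        range(len(feature_maps)),
        key=lambda i: pvariance(feature_maps[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


def binarize(feature_map):
    """Binarize one flattened feature map using its own mean as the threshold."""
    threshold = mean(feature_map)
    return [1 if v > threshold else 0 for v in feature_map]


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Three toy flattened feature maps (hypothetical activations).
maps = [
    [0.5, 0.5, 0.5, 0.5],  # constant map, carries no information
    [0.0, 4.0, 0.0, 4.0],  # strongly varying map
    [1.0, 2.0, 3.0, 2.0],  # mildly varying map
]
kept = select_salient_maps(maps, keep=2)
codes = [binarize(maps[i]) for i in kept]
print(kept, codes, hamming(codes[0], codes[1]))
```

Storing one bit per activation instead of a floating-point value is what shrinks the representation, and Hamming distance keeps matching cheap.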