While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, itis important to note that the human visual cortex generally contains more feedback than feedforward connections. In this paper, we will briefly introduce the background of feedbacks in the human visual cortex, which motivates us to develop a computational feedback mechanism in deep neural networks. In addition to the feedforward inference in traditional neural networks, a feedback loop is introduced to infer the activation status of hidden layer neurons according to the "goal" of the network, e.g., high-level semantic labels. We analogize this mechanism as "Look and Think Twice." The feedback networks help better visualize and understand how deep neural networks work, and capture visual attention on expected objects, even in images with cluttered background and multiple objects. Experiments on ImageNet dataset demonstrate its effectiveness in solving tasks such as image classification and object localization.
Feedback is a fundamental mechanism existing in the human visual system, but has not been explored deeply in designing computer vision algorithms. In this paper, we claim that feedback plays a critical role in understanding convolutional neural networks (CNNs), e.g., how a neuron in CNN describes an object's pattern, and how a collection of neurons form comprehensive perception to an object. To model the feedback in CNNs, we propose a novel model named Feedback CNN and develop two new processing algorithms, i.e., neural pathway pruning and pattern recovering. We have mathematically proven that the proposed method can reach local optimum. Note that Feedback CNN belongs to weakly supervised methods and can be trained only using category-level labels. But it possesses powerful capability to accurately localize and segment category-specific objects. We conduct extensive visualization analysis, and the results reveal the close relationship between neurons and object parts in Feedback CNN. Finally, we evaluate the proposed Feedback CNN over the tasks of weakly supervised object localization and segmentation, and the experimental results on ImageNet and Pascal VOC show that our method remarkably outperforms the state-of-the-art ones.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.