Convolutional neural networks (CNNs) have demonstrated their superiority in numerous computer vision tasks, yet their computational cost results prohibitive for many real-time applications such as pedestrian detection which is usually performed on low-consumption hardware. In order to alleviate this drawback, most strategies focus on using a two-stage cascade approach. Essentially, in the first stage a fast method generates a significant but reduced amount of high quality proposals that later, in the second stage, are evaluated by the CNN. In this work, we propose a novel detection pipeline that further benefits from the two-stage cascade strategy. More concretely, the enriched and subsequently compressed features used in the first stage are reused as the CNN input. As a consequence, a simpler network architecture, adapted for such small input sizes, allows to achieve real-time performance and obtain results close to the state-of-the-art while running significantly faster without the use of GPU. In particular, considering that the proposed pipeline runs in frame rate, the achieved performance is highly competitive. We furthermore demonstrate that the proposed pipeline on itself can serve as an effective proposal generator.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.