CMS-RCNN: Contextual Multi-Scale Region-Based CNN for Unconstrained Face Detection

Zhu, Chenchen; Zheng, Yuejiu; Luu, Khoa; Savvides, Marios

doi:10.1007/978-3-319-61657-5_3

Cited by 213 publications

(204 citation statements)

References 37 publications

Supporting

Mentioning: 199

Contrasting


“…In order to validate the effectiveness of the proposed method, Faster RCNN [15,38] and CMS-RCNN [29] are applied to Sentinel-1 dataset. CMS-RCNN, which has the same resolution as conv5, fuses conv3, conv4 and conv5 by down-sampling.…”

Section: Comparisons With Other Methods (mentioning)

confidence: 99%

“…Since such an operation simply reverses the forward and backward … Due to the respective merits that different layers possess, multiple layers fusion is a popular way to enhance the performance of detection in the current top-performance detector. As CMS-RCNN [29] did, the first way is to integrate down-sampled earlier layers with the last layer of the sharing CNN. Despite the fact that the feature map information is increased, small-sized objects still only cause responses on a tiny area in a fused feature map.…”

Section: Layer Up-sampling With Deconvolution (mentioning)

confidence: 99%
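The fusion described in the two statements above (bringing conv3 and conv4 down to the resolution of conv5, then combining them) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the use of max pooling, the 4x and 2x down-sampling factors, and the helper names `max_pool`, `fuse_by_downsampling`, and `make_channel` are all assumptions made for the example.

```python
# Illustrative sketch only: pooling type, factors, and shapes are assumptions.

def max_pool(channel, factor):
    """Down-sample one 2D channel by taking the max over factor x factor blocks."""
    height, width = len(channel), len(channel[0])
    return [
        [
            max(
                channel[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, width - factor + 1, factor)
        ]
        for r in range(0, height - factor + 1, factor)
    ]


def fuse_by_downsampling(conv3, conv4, conv5):
    """Bring conv3 and conv4 to conv5's resolution, then stack all channels.

    Each argument is a list of 2D channels. In this toy setup conv3 is 4x and
    conv4 is 2x the spatial size of conv5 (a VGG-like assumption).
    """
    pooled3 = [max_pool(ch, 4) for ch in conv3]
    pooled4 = [max_pool(ch, 2) for ch in conv4]
    return pooled3 + pooled4 + conv5  # channel-wise concatenation


def make_channel(size, value):
    """Hypothetical helper: a size x size channel filled with one value."""
    return [[value] * size for _ in range(size)]


conv3 = [make_channel(8, 1.0)]
conv4 = [make_channel(4, 2.0)]
conv5 = [make_channel(2, 3.0)]
fused = fuse_by_downsampling(conv3, conv4, conv5)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

As the second statement notes, the fused map keeps the coarse conv5 resolution, so a small object still covers only a few cells after fusion.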

“…The scale factor µ is able to be updated with the backpropagation and chain rule [29]. In this paper, a fixed scale factor, which makes the fused feature maps have the same mean level as the replaced layer in Faster RCNN, is adopted [28].…”

Section: Normalization (mentioning)

confidence: 99%
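The role of the scale factor in the statement above can be illustrated with a small sketch. It assumes that features are L2-normalized across channels at each spatial position before scaling, and it shows one possible reading of a "fixed scale factor" chosen to match the mean level of a reference layer. The function names and the mean-matching rule are hypothetical and are not taken from either paper.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a channel vector (one spatial position) to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / (norm + eps) for v in vector]


def scale_features(vector, scale):
    """L2-normalize, then multiply by the scale factor (mu in the statement)."""
    return [scale * v for v in l2_normalize(vector)]


def fixed_scale_from_reference(normalized_vectors, reference_values):
    """Assumed rule: pick one scale so the scaled mean equals the reference mean.

    Requires a non-zero mean of the normalized values, which holds for
    non-negative (post-ReLU) features that are not all zero.
    """
    flat = [v for vec in normalized_vectors for v in vec]
    mean_normalized = sum(flat) / len(flat)
    mean_reference = sum(reference_values) / len(reference_values)
    return mean_reference / mean_normalized


pixel = [3.0, 4.0]                      # toy two-channel feature at one position
unit = l2_normalize(pixel)              # approximately [0.6, 0.8]
scale = fixed_scale_from_reference([unit], [7.0, 7.0])
print(scale_features(pixel, scale))
```

A learnable variant would treat `scale` as a parameter and update it by backpropagation instead of fixing it from a reference layer.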

“…For instance, an object located on land is highly unlikely to be considered a ship, while an object with bright intensity in the ocean area is prone to be affirmed as a positive object. In order to mimic the visual effect of a human being in a computer vision field, context information is always added into the deep neural network to recognize the small-sized objects [27,29,33].…”

Section: Integrating Contextual Information (mentioning)

confidence: 99%


Contextual Region-Based Convolutional Neural Network with Multilayer Fusion for SAR Ship Detection

et al. 2017


Synthetic aperture radar (SAR) ship detection has been playing an increasingly essential role in marine monitoring in recent years. The lack of detailed information about ships in wide swath SAR imagery poses difficulty for traditional methods in exploring effective features for ship discrimination. Being capable of feature representation, deep neural networks have achieved dramatic progress in object detection recently. However, most of them suffer from the missing detection of small-sized targets, which means that few of them are able to be employed directly in SAR ship detection tasks. This paper discloses an elaborately designed deep hierarchical network, namely a contextual region-based convolutional neural network with multilayer fusion, for SAR ship detection, which is composed of a region proposal network (RPN) with high network resolution and an object detection network with contextual features. Instead of using low-resolution feature maps from a single layer for proposal generation in an RPN, the proposed method employs an intermediate layer combined with a downscaled shallow layer and an up-sampled deep layer to produce region proposals. In the object detection network, the region proposals are projected onto multiple layers with region of interest (ROI) pooling to extract the corresponding ROI features and contextual features around the ROI. After normalization and rescaling, they are subsequently concatenated into an integrated feature vector for final outputs. The proposed framework fuses the deep semantic and shallow high-resolution features, improving the detection performance for small-sized ships. The additional contextual features provide complementary information for classification and help to rule out false alarms. Experiments based on the Sentinel-1 dataset, which contains twenty-seven SAR images with 7986 labeled ships, verify that the proposed method achieves an excellent performance in SAR ship detection.
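The idea of extracting "contextual features around the ROI" can be sketched as computing an enlarged box around each proposal and pooling features from both boxes. The abstract does not state how the context region is defined, so the centre-preserving enlargement, the factor of 2.0, and the function name `context_box` below are assumptions for illustration only.

```python
def context_box(roi, image_size, enlarge=2.0):
    """Return a box sharing the ROI's centre but `enlarge` times wider and taller.

    roi is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    The result is clipped to the image bounds. The enlargement factor is a
    hypothetical choice, not a value taken from the paper.
    """
    x1, y1, x2, y2 = roi
    width, height = image_size
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * enlarge / 2.0
    half_h = (y2 - y1) * enlarge / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(width), cx + half_w),
        min(float(height), cy + half_h),
    )


inner = context_box((40, 40, 60, 60), (100, 100))
border = context_box((0, 0, 20, 20), (100, 100))
print(inner, border)
```

Features pooled from the ROI and from its context box would then be normalized, rescaled, and concatenated into one vector, as the abstract describes.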



“…Zhang et al. [194] proposed FDNet based on ResNet with larger deformable convolutional kernels to capture image context. Zhu et al. [195] proposed a Contextual Multi-Scale Region-based Convolution Neural Network (CMS-RCNN) in which multi-scale information was grouped both in region proposal and ROI detection to deal with faces at various range of scale. In addition, contextual information around faces is also considered in training detectors.…”

mentioning

confidence: 99%

Recent advances in deep learning for object detection

2020


Object detection is a fundamental visual recognition problem in computer vision and has been widely studied in the past decades. Visual object detection aims to find objects of certain target classes with precise localization in a given image and assign each object instance a corresponding class label. Due to the tremendous successes of deep learning based image classification, object detection techniques using deep learning have been actively studied in recent years. In this paper, we give a comprehensive survey of recent advances in visual object detection with deep learning. By reviewing a large body of recent related work in literature, we systematically analyze the existing object detection frameworks and organize the survey into three major parts: (i) detection components, (ii) learning strategies, and (iii) applications & benchmarks. In the survey, we cover a variety of factors affecting the detection performance in detail, such as detector architectures, feature learning, proposal generation, sampling strategies, etc. Finally, we discuss several future directions to facilitate and spur future research for visual object detection with deep learning.


FaceOff: Anonymizing Videos in the Operating Rooms

Flouty; Zisimopoulos; Stoyanov

2018

Lecture Notes in Computer Science


Video capture in the surgical operating room (OR) is increasingly possible and has potential for use with computer assisted interventions (CAI), surgical data science and within smart OR integration. Captured video innately carries sensitive information that should not be completely visible in order to preserve the patient's and the clinical teams' identities. When surgical video streams are stored on a server, the videos must be anonymized prior to storage if taken outside of the hospital. In this article, we describe how a deep learning model, Faster R-CNN, can be used for this purpose and help to anonymize video data captured in the OR. The model detects and blurs faces in an effort to preserve anonymity. After testing an existing face detection trained model, a new dataset tailored to the surgical environment, with faces obstructed by surgical masks and caps, was collected for fine-tuning to achieve higher face-detection rates in the OR. We also propose a temporal regularisation kernel to improve recall rates. The fine-tuned model achieves a face detection recall of 88.05% and 93.45% before and after applying temporal smoothing, respectively.
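To illustrate why temporal smoothing can raise recall, the sketch below averages a per-frame face confidence over neighbouring frames, so that a single frame with a weak detection is recovered when its neighbours are confident. The abstract does not specify the kernel, so the uniform moving average, the radius, the 0.5 threshold, and the function name `smooth_scores` are stand-in assumptions rather than the method of the paper.

```python
def smooth_scores(scores, radius=1):
    """Uniform moving average over each frame and its neighbours.

    scores holds one detection confidence per frame (a simplification: a real
    system would track individual boxes). Windows are truncated at the ends.
    """
    smoothed = []
    for i in range(len(scores)):
        window = scores[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


raw = [0.9, 0.2, 0.9]                  # middle frame: face missed by the detector
smoothed = smooth_scores(raw)
threshold = 0.5                        # hypothetical decision threshold
print([s > threshold for s in raw], [s > threshold for s in smoothed])
```

The trade-off is that averaging can also carry a detection into frames where the face has really left the view, which favours recall over precision, a reasonable bias for anonymization.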

