ZoomCount: A Zooming Mechanism for Crowd Counting in Static Images

Sajid, Usman; Sajid, Hasan; Wang, Hongcheng; Wang, Guanghui

doi:10.1109/tcsvt.2020.2978717

Cited by 47 publications

(27 citation statements)

References 46 publications


“…Bai et al [52] self-corrected the density map by EM algorithm. ZoomCount [28] proposed a zooming mechanism to tackle the underestimation and overestimation issues due to the density variation problem. Adversarial networks [53], [54], [55] are also used for crowd counting to generate a high-quality density map.…”

Section: B. CNN-based Methods (mentioning)

confidence: 99%

“…Benefiting from the strong representation learning ability of convolutional neural networks (CNN) [10], [11], [12], [13], [14], CNN-based methods [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31] are employed to predict a density map of a still image because the density map contains more spatial information of people distribution and its integral equals the number of people in one image. For example, multi-branch architectures [16], [18], [17], [19] are designed to extract the multi-scale features and detect varying sizes of heads because different-sized convolutional filters have varying receptive fields, which are more useful for learning non-uniform crowd distribution.…”

Section: Introduction (mentioning)

confidence: 99%

S$^2$FPR: Crowd Counting via Self-Supervised Coarse to Fine Feature Pyramid Ranking

Gao, Huang, Lei et al. 2022

Preprint

Most conventional crowd counting methods utilize a fully-supervised learning framework to learn a mapping between scene images and crowd density maps. Under the circumstances of such fully-supervised training settings, a large quantity of expensive and time-consuming pixel-level annotations are required to generate density maps as the supervision. One way to reduce costly labeling is to exploit self-structural information and inner-relations among unlabeled images. Unlike the previous methods utilizing these relations and structural information from the original image level, we explore such self-relations from the latent feature spaces because it can extract more abundant relations and structural information. Specifically, we propose S$^2$FPR which can extract structural information and learn partial orders of coarse-to-fine pyramid features in the latent space for better crowd counting with massive unlabeled images. In addition, we collect a new unlabeled crowd counting dataset (FUDAN-UCC) with 4,000 images in total for training. One by-product is that our proposed S$^2$FPR method can leverage numerous partial orders in the latent space among unlabeled images to strengthen the model representation capability and reduce the estimation errors for the crowd counting task. Extensive experiments on four benchmark datasets, i.e., the UCF-QNRF, the ShanghaiTech PartA and PartB, and the UCF-CC-50, show the effectiveness of our method compared with previous semi-supervised methods. The source code and dataset are available at https://github.com/bridgeqiqi/S2FPR.

“…Deep learning models: Inspired by the success of AlexNet [16] in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, convolutional neural networks (CNN) have attracted a lot of attention and been successfully applied to image classification [20–22], object detection [4, 23, 24], depth estimation [25, 26], image transformation [27, 28], and crowd counting [29]citesajid2020plug. VGGNets [14], and GoogleNet [17], the ILSVRC winners of 2014 and 2015, proved that deeper models could significantly increase the ability of representations.…”

Section: Introduction (mentioning)

confidence: 99%

A comparative study on polyp classification using convolutional neural networks

Patel, Tao et al. 2020

PLoS ONE

Self Cite

Colorectal cancer is the third most common cancer diagnosed in both men and women in the United States. Most colorectal cancers start as a growth on the inner lining of the colon or rectum, called 'polyp'. Not all polyps are cancerous, but some can develop into cancer. Early detection and recognition of the type of polyps is critical to prevent cancer and change outcomes. However, visual classification of polyps is challenging due to varying illumination conditions of endoscopy, variant texture, appearance, and overlapping morphology between polyps. More importantly, evaluation of polyp patterns by gastroenterologists is subjective leading to a poor agreement among observers. Deep convolutional neural networks have proven very successful in object classification across various object categories. In this work, we compare the performance of the state-of-the-art general object classification models for polyp classification. We trained a total of six CNN models end-to-end using a dataset of 157 video sequences composed of two types of polyps: hyperplastic and adenomatous. Our results demonstrate that the state-of-the-art CNN models can successfully classify polyps with an accuracy comparable or better than reported among gastroenterologists. The results of this study can guide future research in polyp classification.

“…Deep networks have been dramatically driving the progress of computer vision, bringing out a series of popular models for different vision tasks [35] [31], like image classification [3] [29], object detection [32] [15], crowd counting [25], depth estimation [10], and image translation [30]. Object detection plays an important role and serves as a prerequisite for numerous computer vision applications, such as instance segmentation, face recognition, autonomous driving, and video analysis [1], [9], [11], [12], [21].…”

Section: Introduction (mentioning)

confidence: 99%

Location-Aware Box Reasoning for Anchor-Based Single-Shot Object Detection

Wang 2020

IEEE Access

Self Cite

In the majority of object detection frameworks, the confidence of instance classification is used as the quality criterion of predicted bounding boxes, like the confidence-based ranking in nonmaximum suppression (NMS). However, the quality of bounding boxes, indicating the spatial relations, is not only correlated with the classification scores. Compared with the region proposal network (RPN) based detectors, single-shot object detectors suffer the box quality as there is a lack of pre-selection of box proposals. In this paper, we aim at single-shot object detectors and propose a location-aware anchor-based reasoning (LAAR) for the bounding boxes. LAAR takes both the location and classification confidences into consideration for the quality evaluation of bounding boxes. We introduce a novel network block to learn the relative location between the anchors and the ground truths, denoted as a localization score, which acts as a location reference during the inference stage. The proposed localization score leads to an independent regression branch and calibrates the bounding box quality by scoring the predicted localization score so that the best-qualified bounding boxes can be picked up in NMS. Experiments on MS COCO and PASCAL VOC benchmarks demonstrate that the proposed location-aware framework enhances the performances of current anchor-based single-shot object detection frameworks and yields consistent and robust detection results.
