Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks

Jiang, Xiaolong; Xiao, Zehao; Zhang, Baochang; Zhen, Xiantong; Cao, Xianbin; Doermann, David; Shao, Ling

doi:10.1109/cvpr.2019.00629

Cited by 321 publications

(176 citation statements)

References 54 publications

Supporting

Mentioning

175

Contrasting

“…Most recently, several methods have focused on incorporating additional cues such as segmentation and semantic priors [61,75], attention [31,54,58], perspective [50], context information respectively [33], multiple-views [70] and multi-scale features [20] into the network. Wang et al [63] introduced a new synthetic dataset and proposed a SSIM based CycleGAN [78] to adapt the synthetic datasets to real world dataset.…”

Section: Related Work (mentioning)

confidence: 99%

Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting

Sindagi

Patel

2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

170

Crowd counting presents enormous challenges in the form of large variation in scales within images and across the dataset. These issues are further exacerbated in highly congested scenes. Approaches based on straightforward fusion of multi-scale features from a deep network seem to be obvious solutions to this problem. However, these fusion approaches do not yield significant improvements in the case of crowd counting in congested scenes. This is usually due to their limited abilities in effectively combining the multi-scale features for problems like crowd counting. To overcome this, we focus on how to efficiently leverage information present in different layers of the network. Specifically, we present a network that involves: (i) a multilevel bottom-top and top-bottom fusion (MBTTBF) method to combine information from shallower to deeper layers and vice versa at multiple levels, (ii) scale complementary feature extraction blocks (SCFB) involving cross-scale residual functions to explicitly enable flow of complementary features from adjacent conv layers along the fusion paths. Furthermore, in order to increase the effectiveness of the multi-scale fusion, we employ a principled way of generating scale-aware ground-truth density maps for training. Experiments conducted on three datasets that contain highly congested scenes (ShanghaiTech, UCF CROWD 50, and UCF-QNRF) demonstrate that the proposed method is able to outperform several recent methods in all the datasets.

“…Recent approaches like [22,47,48,51,62] have aimed at incorporating various forms of related information like attention [22], semantic priors [51], segmentation [62], inverse attention [48], and hierarchical attention [47] respectively into the network. Other techniques such as [12,23,40,60] leverage features from different layers of the network using different techniques like trellis style encoder decoder [12], explicitly considering perspective [40], context information [23], and multiple views [60]. Crowd Datasets.…”

Section: Related Work (mentioning)

confidence: 99%

“…
Method                 MAE      MSE
Idrees et al [10]      315.0    508.0
Zhang et al [59]       277.0    426.0
CMTL et al [43]        252.0    514.0
Switching-CNN [38]     228.0    445.0
Idrees et al [11]      132.0    191.0
Jian et al [12]        113.0    188.0
CG-DRCN (proposed)     112.2    176.3

column network (MCNN) [61], cascaded multi-task learning for crowd counting (CMTL) [43], Switching-CNN [38], CSR-Net [20] and SANet [4]. Furthermore, we also evaluate the proposed method (CG-DRCN) and demonstrate its effectiveness over the other methods.…”

Section: JHU-CROWD Dataset (mentioning)

confidence: 99%
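
The MAE and MSE columns in the comparison quoted above are count-level error metrics. The sketch below shows one way they could be computed from per-image predicted and ground-truth counts. It assumes the convention common in crowd-counting papers, where "MSE" denotes the square root of the mean squared count error; the quoted snippet does not state which definition it uses. The function names and the sample counts are illustrative and are not taken from any of the cited papers.

```python
import math


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and true per-image counts."""
    n = len(true_counts)
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n


def mse(pred_counts, true_counts):
    """Root of the mean squared count error.

    Assumption: this follows the crowd-counting convention in which the
    column labelled "MSE" is actually a root-mean-square error.
    """
    n = len(true_counts)
    squared = sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts))
    return math.sqrt(squared / n)


# Made-up counts for three images, for illustration only.
pred = [105.0, 48.0, 310.0]
true = [100.0, 50.0, 300.0]

print(f"MAE = {mae(pred, true):.3f}")
print(f"MSE = {mse(pred, true):.3f}")
```

Because the squared term penalizes large misses more heavily, a method can rank differently on the two columns, as CMTL and Idrees et al [10] do in the table.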

“…ShanghaiTech: The proposed network is trained on the train splits using the same strategy as discussed in Section 3.4. Table 5 shows the results of the proposed method on ShanghaiTech as compared with several recent approaches ( [38], [44], [2], [41], [24], [20], [33] , [4], [39] and [12]). It can be observed that the proposed method outperforms all existing methods on Part A of the dataset, while achieving comparable performance on Part B.…”

Section: Comparison On Other Datasets (mentioning)

confidence: 99%

“…Results on the UCF-QNRF [11] dataset as compared with recent methods ( [10], [61], [43]) are shown in Table 6. The proposed method is compared against different approaches: [10], [61], [43], [38], [11] and [12]. It can be observed that the proposed method outperforms other methods by a considerable margin.…”

Section: UCF-QNRF (mentioning)

confidence: 99%

Pushing the Frontiers of Unconstrained Crowd Counting: New Dataset and Benchmark Method

Sindagi

Yasarla

Patel

2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

In this work, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs density map generated by the final layer as a coarse prediction to refine and generate finer density maps in a progressive fashion using residual learning. Additionally, the residual learning is guided by an uncertainty-based confidence weighting mechanism that permits the flow of only high-confidence residuals in the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets, and it achieves significant improvements in errors. Furthermore, we introduce a new large scale unconstrained crowd counting dataset (JHU-CROWD) that is ∼2.8 × larger than the most recent crowd counting datasets in terms of the number of images. It contains 4,250 images with 1.11 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weather-based degradations and illumination variations in addition to many distractor images, making it a very challenging dataset. Additionally, the dataset consists of rich annotations at both image-level and head-level. Several recent methods are evaluated and compared on this dataset.
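
The refinement idea in this abstract (a coarse density map corrected by residuals that pass only where confidence is high) can be illustrated with a toy sketch. This is not the CG-DRCN implementation: the 2x nearest-neighbour upsampling, the hard threshold of 0.5, the multiplication of the residual by its confidence, and all function names are assumptions made for illustration, and the maps are plain nested lists rather than network outputs.

```python
def upsample2x(density):
    """Nearest-neighbour 2x upsampling that preserves the total count.

    Each cell is split into four cells, so its value is divided by 4.
    """
    out = []
    for row in density:
        wide = [v / 4.0 for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def refine(coarse, residual, confidence, threshold=0.5):
    """Add confidence-weighted residuals to the upsampled coarse map.

    Residuals whose confidence is below `threshold` are dropped entirely
    (hypothetical gating rule, chosen only to make the idea concrete).
    """
    up = upsample2x(coarse)
    return [
        [u + (c * r if c >= threshold else 0.0)
         for u, r, c in zip(u_row, r_row, c_row)]
        for u_row, r_row, c_row in zip(up, residual, confidence)
    ]


def count(density):
    """The crowd count is the sum over the density map."""
    return sum(sum(row) for row in density)


# Toy 1x2 coarse map (count 12) refined to a 2x4 map.
coarse = [[4.0, 8.0]]
residual = [[0.5, 0.0, -0.5, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
confidence = [[0.9, 0.9, 0.2, 0.9],
              [0.9, 0.9, 0.9, 0.8]]

refined = refine(coarse, residual, confidence)
print(count(upsample2x(coarse)), count(refined))
```

In this toy input the residual of -0.5 is discarded because its confidence (0.2) falls below the threshold, while the other two nonzero residuals are scaled by their confidence before being added.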

Visual crowd analysis: Open research problems

Khan

Menouar

Hamila

2023

AI Magazine

Over the last decade, there has been a remarkable surge in interest in automated crowd monitoring within the computer vision community. Modern deep‐learning approaches have made it possible to develop fully automated vision‐based crowd‐monitoring applications. However, despite the magnitude of the issue at hand, the significant technological advancements, and the consistent interest of the research community, there are still numerous challenges that need to be overcome. In this article, we delve into six major areas of visual crowd analysis, emphasizing the key developments in each of these areas. We outline the crucial unresolved issues that must be tackled in future works, in order to ensure that the field of automated crowd monitoring continues to progress and thrive. Several surveys related to this topic have been conducted in the past. Nonetheless, this article thoroughly examines and presents a more intuitive categorization of works, while also depicting the latest breakthroughs within the field, incorporating more recent studies carried out within the last few years in a concise manner. By carefully choosing prominent works with significant contributions in terms of novelty or performance gains, this paper presents a more comprehensive exposition of advancements in the current state‐of‐the‐art.
