Towards A Universal Model for Cross-Dataset Crowd Counting

Ma, Zhiheng; Hong, Xiaopeng; Wei, Xing; Qiu, Yunfeng; Gong, Yi

doi:10.1109/iccv48922.2021.00319

Cited by 45 publications

(15 citation statements)

References 44 publications


“…Lin et al [45] further improve the loss function based on Sinkhorn distance. More improvements such as incorporating perspective information [41], [46], auxiliary task [34], [47], cross-datasets training [48], [49] and neural architecture search [50] further promote the counting performance. However, as revealed in [51], [52], designing powerful deep architectures remains an active topic in crowd counting.…”

Section: A Crowd Counting (mentioning)

confidence: 99%

Scene-Adaptive Attention Network for Crowd Counting

Wang, Kang, Yang et al. 2021

Preprint

Self Cite


In recent years, significant progress has been made in crowd counting research. However, because of the challenging scale variations and complex scenes found in crowds, neither traditional convolutional networks nor recent Transformer architectures with fixed-size attention handle the task well. To address this problem, this paper proposes a scene-adaptive attention network, termed SAANet. First, we design a Transformer backbone with built-in deformable attention, which learns adaptive feature representations with deformable sampling locations and dynamic attention weights. We then propose multi-level feature fusion and count-attentive feature enhancement modules to strengthen the feature representation under the global image context. The learned representations attend to the foreground and adapt to different crowd scales. We conduct extensive experiments on four challenging crowd counting benchmarks, demonstrating that our method achieves state-of-the-art performance. In particular, our method currently ranks No. 1 on the public leaderboard of the NWPU-Crowd benchmark. We hope our method can serve as a strong baseline to support future research in crowd counting. The source code will be released to the community.


“…Object counting strives to estimate the total number of objects dispersing across still images [35] or dynamic video sequences [25]. It has increasingly drawn attention from computer vision community, thanks to its wide spread societal applications, e.g., social distance monitoring [28], traffic surveillance [48], counting in agriculture [24] and metropolis crowd management [26].…”

Section: Introduction (mentioning)

confidence: 99%

CrowdMLP: Weakly-Supervised Crowd Counting via Multi-Granularity MLP

Wang, Zhou, Cai et al. 2022

Preprint


Existing state-of-the-art crowd counting algorithms rely excessively on location-level annotations, which are burdensome to acquire. When only count-level (weak) supervisory signals are available, regressing total counts is arduous and error-prone due to the lack of explicit spatial constraints. To address this issue, a novel and efficient counter (referred to as CrowdMLP) is presented, which models global dependencies of embeddings and regresses total counts with a multi-granularity MLP regressor. Specifically, a locally-focused pre-trained frontend is cascaded to extract crude feature maps with intrinsic spatial cues, which prevent the model from collapsing into trivial outcomes. The crude embeddings, along with raw crowd scenes, are tokenized at different granularity levels. The multi-granularity MLP then mixes tokens along the cardinality, channel, and spatial dimensions to mine global information. An effective proxy task, namely Split-Counting, is also proposed to overcome the barrier of limited samples and the shortage of spatial hints in a self-supervised manner. Extensive experiments demonstrate that CrowdMLP significantly outperforms existing weakly-supervised counting algorithms and performs on par with state-of-the-art location-level supervised approaches.


“…In recent years, typical counting methods [20,21,41,50] utilize the Convolution Neural Network (CNN) as backbone and regress density map to predict the total crowd count. However, due to the wide viewing angle of cameras and the 2D perspective projection, large-scale variations often exist in crowd images.…”

Section: Introduction (mentioning)

confidence: 99%

Boosting Crowd Counting via Multifaceted Attention

Lin, Ma, Ji et al. 2022

Preprint

Self Cite


This paper focuses on the challenging crowd counting task. As large-scale variations often exist within crowd images, neither the fixed-size convolution kernels of CNNs nor the fixed-size attention of recent vision transformers can handle such variations well. To address this problem, we propose a Multifaceted Attention Network (MAN) to improve transformer models in local spatial relation encoding. MAN incorporates global attention from the vanilla transformer, learnable local attention, and instance attention into a counting model. First, the local Learnable Region Attention (LRA) is proposed to dynamically assign exclusive attention to each feature location. Second, we design the Local Attention Regularization to supervise the training of LRA by minimizing the deviation among the attention for different feature locations. Finally, we provide an Instance Attention mechanism to focus on the most important instances dynamically during training. Extensive experiments on four challenging crowd counting datasets, namely ShanghaiTech, UCF-QNRF, JHU++, and NWPU, have validated the proposed method. Code: https://github.com/LoraLinH/Boosting-Crowd-Counting-via-Multifaceted-Attention.
