Dynamic Momentum Adaptation for Zero-Shot Cross-Domain Crowd Counting

Wu, Qishi; Wan, Jia; Chan, Antoni B.

doi:10.1145/3474085.3475230

Cited by 13 publications

(6 citation statements)

References 40 publications


“…The style-transfer methods (Wang et al 2019, 2021; Gao et al 2021, 2019) narrow the domain gap by translating the synthetic images into photo-realistic images, but it is limited by the performance of the translation method. The feature-level adaptation methods (Gao, Yuan, and Wang 2020; Han et al 2020; Li, Yongbo, and Xiangyang 2019; Wu, Wan, and Chan 2021; Zou et al 2021; Wang et al 2022) measure the domain discrepancy by one or more discriminators to make data distributions across domains closer. The self-supervised methods (Liu, Durasov, and Fua 2022; Cai et al 2021) generate useful pseudo-labels on the target real images for finetuning purposes to retrain the counter.…”

Section: Cross Domain Crowd Counting (mentioning)

confidence: 99%

“…The first category transforms the source synthetic image into a photo-realistic intermediate image which is then trained in a supervised manner (Wang et al 2019, 2021; Gao et al 2021). The second category aims to align the feature distributions between the source and target domains via one or more discriminators (Gao, Yuan, and Wang 2020; Han et al 2020; Li, Yongbo, and Xiangyang 2019; Wu, Wan, and Chan 2021; Zou et al 2021; Wang et al 2022). The third category generates useful pseudo labels for fine-tuning purposes to retrain the counter, which compromises on the accuracy and the incurred noisy pseudo labels (Liu, Durasov, and Fua 2022; Cai et al 2021).…”

Section: Introduction (mentioning)

confidence: 99%


Explicit Invariant Feature Induced Cross-Domain Crowd Counting

Cai, Chen, Guan, et al. 2023. AAAI


Cross-domain crowd counting has shown progressively improved performance. However, most methods fail to explicitly consider the transferability of different features between source and target domains. In this paper, we propose an innovative explicit Invariant Feature induced Cross-domain Knowledge Transformation framework to address the inconsistent domain-invariant features of different domains. The main idea is to explicitly extract domain-invariant features from both source and target domains, which builds a bridge to transfer richer knowledge between the two domains. The framework consists of three parts: global feature decoupling (GFD), relation exploration and alignment (REA), and graph-guided knowledge enhancement (GKE). In the GFD module, domain-invariant features are efficiently decoupled from domain-specific ones in the two domains, which allows the model to distinguish crowd features from backgrounds in complex scenes. In the REA module, both an inter-domain relation graph (Inter-RG) and an intra-domain relation graph (Intra-RG) are built. Specifically, Inter-RG aggregates multi-scale domain-invariant features between the two domains and further aligns local-level invariant features. Intra-RG preserves task-related specific information to assist the domain alignment. Furthermore, the GKE strategy models the confidence of pseudo-labels to further enhance the adaptability of the target domain. Various experiments show our method achieves state-of-the-art performance on the standard benchmarks. Code is available at https://github.com/caiyiqing/IF-CKT.


“…To tackle the domain shift problem in crowd counting, many works rely on the domain adaptation (DA) technique (Wang et al 2019; Sindagi et al 2020; He et al 2021; Liu, Durasov, and Fua 2022; Gong et al 2022). DA adapts a model learned from a source domain to a target domain, where the target data is normally (at least partially) available, labeled or unlabeled, for model retraining or fine-tuning (Liu et al 2020; He et al 2021; Reddy et al 2021; Wu, Wan, and Chan 2021). For instance, (Reddy et al 2020; Wang et al 2021a; Hossain et al 2019) have developed few-shot learning methods by assuming a few labeled data are available in the target domain for model finetuning.…”

Section: Crowd Counting (mentioning)

confidence: 99%

“…They are initialized and iteratively updated by learned image features in the network. (Wu, Wan, and Chan 2021) have utilized this type of memory to build a crowd counting model for DA where each memory vector is defined as the mean head feature per image. Their model does not address the varying size and number of heads in crowd images, hence memory vectors can be subject to noise.…”

Section: Domain Generalization (mentioning)

confidence: 99%

Domain-General Crowd Counting in Unseen Scenarios

Deng, Shi 2023. AAAI


Domain shift across crowd data severely hinders crowd counting models from generalizing to unseen scenarios. Although domain adaptive crowd counting approaches close this gap to a certain extent, they are still dependent on the target domain data to adapt (e.g., finetune) their models to the specific domain. In this paper, we instead aim to train a model based on a single source domain which can generalize well on any unseen domain. This falls into the realm of domain generalization, which remains unexplored in crowd counting. We first introduce a dynamic sub-domain division scheme which divides the source domain into multiple sub-domains such that we can initiate a meta-learning framework for domain generalization. The sub-domain division is dynamically refined during the meta-learning. Next, in order to disentangle domain-invariant information from domain-specific information in image features, we design the domain-invariant and -specific crowd memory modules to re-encode image features. Two types of losses, i.e., feature reconstruction and orthogonal losses, are devised to enable this disentanglement. Extensive experiments on several standard crowd counting benchmarks, i.e., SHA, SHB, QNRF, and NWPU, show the strong generalizability of our method. Our code is available at: https://github.com/ZPDu/Domain-general-Crowd-Counting-in-Unseen-Scenarios

“…Cross-domain / Multi-domain Learning. Many researchers exploit the cross-domain problems [40,41,42,43,25] in crowd counting, including cross-scene [32], cross-view [44], cross-modal [45], etc. Adversarial Scoring Network [41] is applied to adapt to the target domain from coarse to fine granularity.…”

Section: Related Work (mentioning)

confidence: 99%

Forget Less, Count Better: A Domain-Incremental Self-Distillation Learning Benchmark for Lifelong Crowd Counting

Gao, Li, Shan, et al. 2022. Preprint


Crowd counting has important applications in public safety and pandemic control. A robust and practical crowd counting system has to be capable of continuously learning from newly arriving domain data in real-world scenarios instead of fitting one domain only. Off-the-shelf methods have some drawbacks in handling multiple domains. 1) The models achieve limited performance (or even drop dramatically) on old domains after training on images from new domains, due to the discrepancies of intrinsic data distributions across domains; this is called catastrophic forgetting. 2) A model well trained on a specific domain achieves imperfect performance on other unseen domains because of the domain shift. 3) Either mixing all the data for training or simply training dozens of separate models for different domains leads to linearly increasing storage overhead as new domains become available. To overcome these issues, we investigate a new task of crowd counting under the incremental domains training setting, namely, Lifelong Crowd Counting. It aims at alleviating the catastrophic forgetting and improving the generalization ability using a single model updated by the incremental domains. To be more specific, we propose a self-distillation learning framework as a benchmark (Forget Less, Count Better, FLCB) for lifelong crowd counting, which helps the model sustainably leverage previous meaningful knowledge for better crowd counting to mitigate the forgetting when the new data arrive. Meanwhile, a new quantitative metric, normalized backward transfer (nBwT), is developed to evaluate the forgetting degree of the model in the lifelong learning process. Extensive experimental results demonstrate the superiority of our proposed benchmark in achieving a low catastrophic forgetting degree and strong generalization ability.

Abstract: …model fine-tuning, while also outperforming domain adaptation methods that use fine-tuning on target domain data. Moreover, C²MoT also obtains state-of-the-art counting performance on the source domain.
