Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475230

Dynamic Momentum Adaptation for Zero-Shot Cross-Domain Crowd Counting

Abstract: …model fine-tuning, while also outperforming domain adaptation methods that use fine-tuning on target domain data. Moreover, C²MoT also obtains state-of-the-art counting performance on the source domain.

Cited by 13 publications (6 citation statements)
References 40 publications
“…The style-transfer methods (Wang et al 2019, 2021; Gao et al 2021, 2019) narrow the domain gap by translating the synthetic images into photo-realistic images, but it is limited by the performance of the translation method. The feature-level adaptation methods (Gao, Yuan, and Wang 2020; Han et al 2020; Li, Yongbo, and Xiangyang 2019; Wu, Wan, and Chan 2021; Zou et al 2021; Wang et al 2022) measure the domain discrepancy by one or more discriminators to make data distributions across domains closer. The self-supervised methods (Liu, Durasov, and Fua 2022; Cai et al 2021) generate useful pseudo-labels on the target real images for finetuning purposes to retrain the counter.…”
Section: Cross Domain Crowd Counting (mentioning)
confidence: 99%
“…The first category transforms the source synthetic image into a photo-realistic intermediate image which is then trained in a supervised manner (Wang et al 2019, 2021; Gao et al 2021). The second category aims to align the feature distributions between the source and target domains via one or more discriminators (Gao, Yuan, and Wang 2020; Han et al 2020; Li, Yongbo, and Xiangyang 2019; Wu, Wan, and Chan 2021; Zou et al 2021; Wang et al 2022). The third category generates useful pseudo labels for fine-tuning purposes to retrain the counter, which compromises on the accuracy and the incurred noisy pseudo labels (Liu, Durasov, and Fua 2022; Cai et al 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…To tackle the domain shift problem in crowd counting, many works rely on the domain adaptation (DA) technique (Wang et al 2019; Sindagi et al 2020; He et al 2021; Liu, Durasov, and Fua 2022; Gong et al 2022). DA adapts a model learned from a source domain to a target domain, where the target data is normally (at least partially) available, labeled or unlabeled, for model retraining or fine-tuning (Liu et al 2020; He et al 2021; Reddy et al 2021; Wu, Wan, and Chan 2021). For instance, (Reddy et al 2020; Wang et al 2021a; Hossain et al 2019) have developed few-shot learning methods by assuming a few labeled data are available in the target domain for model finetuning.…”
Section: Crowd Counting (mentioning)
confidence: 99%
“…They are initialized and iteratively updated by learned image features in the network. (Wu, Wan, and Chan 2021) have utilized this type of memory to build a crowd counting model for DA where each memory vector is defined as the mean head feature per image. Their model does not address the varying size and number of heads in crowd images, hence memory vectors can be subject to noise.…”
Section: Domain Generalization (mentioning)
confidence: 99%
“…Cross-domain / Multi-domain Learning. Many researchers exploit the cross-domain problems [40,41,42,43,25] in crowd counting, including cross-scene [32], cross-view [44], cross-modal [45], etc. Adversarial Scoring Network [41] is applied to adapt to the target domain from coarse to fine granularity.…”
Section: Related Work (mentioning)
confidence: 99%