Proceedings of the ACM Web Conference 2023
DOI: 10.1145/3543507.3583518
pFedPrompt: Learning Personalized Prompt for Vision-Language Models in Federated Learning

Cited by 8 publications (6 citation statements) · References 23 publications
“…FedCLIP [38] added an adapter module after the CLIP backbone to achieve efficient deployment of the CLIP model [70] with federated clients. Some studies [19, 35] have utilized the idea of prompt training to aggregate user consensus via a learnable prompt and capture users’ characteristics in the visual domain. Improving the ability to integrate large-scale pre-trained models will greatly enhance the performance of MFL systems.…”
Section: Discussion
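The prompt-training idea mentioned above — keeping the large vision-language backbone frozen and federating only a small learnable prompt — can be sketched as follows. This is an illustrative toy (dimensions, learning rate, and the gradient stand-in are assumptions, not values from the cited papers): clients take local steps on the prompt tensor alone, and the server averages the prompts, FedAvg-style.

```python
import numpy as np

# Illustrative sizes, not from the cited papers.
PROMPT_TOKENS, EMBED_DIM = 4, 8

def local_prompt_update(prompt, grad, lr=0.1):
    """One gradient step on the client's learnable prompt only;
    the backbone stays frozen and is never communicated."""
    return prompt - lr * grad

def server_aggregate(client_prompts):
    """FedAvg over prompt tensors: the only parameters exchanged."""
    return np.mean(client_prompts, axis=0)

rng = np.random.default_rng(0)
global_prompt = np.zeros((PROMPT_TOKENS, EMBED_DIM))

for _ in range(3):  # communication rounds
    updated = [
        # Random noise stands in for each client's local prompt gradient.
        local_prompt_update(global_prompt, rng.normal(size=global_prompt.shape))
        for _ in range(5)  # five simulated clients
    ]
    global_prompt = server_aggregate(updated)

print(global_prompt.shape)  # (4, 8)
```

The communication cost per round is just the prompt tensor, which is why this style of PEFT is attractive for federated clients.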
“…CreamFL [37] allowed both unimodal and multimodal vision–language tasks in federated systems. pFedPrompt [35] adapted the prompt-training method to bring large foundation models into federated learning systems, connecting vision and language data. FedCMR [11] explored the federated cross-modal retrieval task and mitigated the representation-space gap via weighted aggregation based on the local data amount and category number.…”
Section: Tasks for Multimodal Federated Learning
“…(1) Image classification. (Sun et al. 2022) evaluate the existing PEFT baselines combined with FL, while (Guo et al. 2022; Guo, Guo, and Wang 2023; Li et al. 2023; Lu et al. 2023) fine-tune the CLIP model (Radford et al. 2021) by tuning and communicating only a small number of learnable (personalized) prompts. (Su et al. 2022) addresses the problem of heterogeneous client images by injecting lightweight adaptation modules (adapters) (Houlsby et al. 2019).…”
Section: Related Work
“…Existing works have predominantly explored a basic combination of centralized PEFT algorithms and FedAvg. For instance, some approaches train and communicate only tiny adaptation modules (adapters) (Houlsby et al. 2019; Su et al. 2022) or a small number of trainable input tokens (Guo et al. 2022; Guo, Guo, and Wang 2023). However, these investigations are limited to single-modality scenarios, where only visual or textual tasks are considered.…”
Section: Introduction
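The adapter alternative described in the statement above (Houlsby-style modules inserted after a frozen backbone, as in FedCLIP) can be sketched as a small bottleneck with a residual connection. All dimensions and initializations here are illustrative assumptions; only the two adapter matrices would be trained and communicated in the federated setting.

```python
import numpy as np

def adapter(features, w_down, w_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual.
    Only w_down and w_up are trainable; the backbone is frozen."""
    hidden = np.maximum(features @ w_down, 0.0)  # down-projection + ReLU
    return features + hidden @ w_up              # up-projection + residual

rng = np.random.default_rng(0)
d, r = 16, 4  # feature dim and bottleneck dim (illustrative choices)
w_down = rng.normal(scale=0.02, size=(d, r))
w_up = rng.normal(scale=0.02, size=(r, d))

# Stand-in for features produced by a frozen CLIP-like backbone.
frozen_backbone_output = rng.normal(size=(2, d))
out = adapter(frozen_backbone_output, w_down, w_up)
print(out.shape)  # (2, 16)
```

Because r is much smaller than d, the per-round payload is roughly 2·d·r parameters instead of the full backbone, which is the efficiency argument these works make.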
“…Finally, FL algorithms like FedAvg and SCAFFOLD can be enhanced using momentum, leading to improved convergence rates and performance even under varying data heterogeneity and partial client participation [154]. The authors of [155] introduced personalized federated learning (pFL) and demonstrated its application in tailoring models for diverse users within a decentralized system. Additionally, they employed the Context Optimization (CoOp) method for fine-tuning pre-trained vision-language models.…”
Section: Federated Learning
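The momentum enhancement of FedAvg mentioned in the last statement can be sketched as a server-side momentum buffer (FedAvgM-style). This is a hedged toy, not the algorithm from [154]: function names, the two-client example, and the hyperparameters are all illustrative.

```python
import numpy as np

def fedavg_momentum_round(global_w, client_ws, velocity, beta=0.9, server_lr=1.0):
    """One round: average the client updates, pass them through a
    server-side momentum buffer, then apply to the global weights."""
    avg_delta = np.mean([w - global_w for w in client_ws], axis=0)
    velocity = beta * velocity + avg_delta
    return global_w + server_lr * velocity, velocity

w = np.zeros(3)        # global model (toy 3-parameter vector)
v = np.zeros(3)        # momentum buffer
clients = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
w, v = fedavg_momentum_round(w, clients, v)
print(w)  # the averaged client update, scaled through the momentum buffer
```

With beta = 0 this reduces to plain FedAvg; the momentum term is what smooths updates across rounds under heterogeneity and partial participation.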