Published: 2023
DOI: 10.1109/jstsp.2022.3223526

Knowledge Selection and Local Updating Optimization for Federated Knowledge Distillation With Heterogeneous Models

Cited by 4 publications (6 citation statements)
References 15 publications
“…In the experiments, ten clients participate in the distillation process, and we evaluate the model’s performance under two non-IID distribution settings: a strong non-IID setting and a weak non-IID setting, where each client has one unique class and two classes, respectively. Several representative federated distillation methods are compared, including FedMD 13 , FedED 19 , DS-FL 20 , FKD 34 , and PLS 26 . Among them, FedMD, FedED, and DS-FL rely on a proxy dataset to transfer knowledge, while FKD and PLS are data-free KD approaches that share class-wise average predictions among users.…”
Section: Results (mentioning, confidence: 99%)
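
For concreteness, a minimal sketch of how such a class-based non-IID partition could be produced; the helper name, the use of NumPy, and the cycling assignment of classes are illustrative assumptions rather than the cited papers' exact procedure:

import numpy as np

def partition_non_iid(labels, num_clients=10, classes_per_client=1, seed=0):
    """Assign each client samples drawn from a fixed number of classes.

    classes_per_client=1 mimics the 'strong' non-IID setting described above,
    classes_per_client=2 the 'weak' one. For simplicity, clients that are
    assigned the same class share that class's full index set.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    # Cycle through the classes so consecutive clients get distinct subsets.
    client_classes = [
        [classes[(c * classes_per_client + k) % len(classes)]
         for k in range(classes_per_client)]
        for c in range(num_clients)
    ]
    client_indices = []
    for owned in client_classes:
        idx = np.where(np.isin(labels, owned))[0]
        rng.shuffle(idx)
        client_indices.append(idx)
    return client_indices

# Example: 10 clients, strong non-IID split of a 10-class label vector.
labels = np.repeat(np.arange(10), 100)
splits = partition_non_iid(labels, num_clients=10, classes_per_client=1)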
“…Nevertheless, without a well-trained teacher, FD relies on the ensemble of local predictors for distillation, making it sensitive to the training state of local models, which may suffer from poor quality and underfitting. Besides, the non-identically independently distributed (non-IID) data distributions 24 , 25 across clients exacerbate this issue, since the local models cannot output accurate predictions on the proxy samples that are outside their local distributions 26 . To address the negative impact of misleading knowledge, an alternative is to incorporate soft labels (i.e., normalized logits) 17 during knowledge distillation to enhance the generalization performance.…”
Section: Introduction (mentioning, confidence: 99%)
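
As a reference point for the soft-label idea mentioned in that statement, here is a minimal PyTorch-style sketch of distillation against ensemble soft labels; the temperature value and the simple averaging of client logits into a teacher signal are assumptions for illustration:

import torch
import torch.nn.functional as F

def soft_label_distillation_loss(student_logits, client_logits_list, temperature=3.0):
    """KL divergence between the student's softened predictions and the
    ensemble soft labels obtained by averaging the clients' logits."""
    # Ensemble 'teacher' signal: average of the participating clients' logits.
    teacher_logits = torch.stack(client_logits_list).mean(dim=0)
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Standard KD scaling by T^2 keeps gradient magnitudes comparable.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2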
“…In the experiments, ten clients participate in the distillation process, and we evaluate the model's performance under two non-IID distribution settings across clients: a strong non-IID setting and a weak non-IID setting, where each client has one unique class and two classes, respectively. Several representative federated distillation methods are compared, including FedMD 12 , FedED 17 , DS-FL 18 , FKD 31 , and PLS 23 . Among them, FedMD, FedED, and DS-FL rely on a proxy dataset to transfer knowledge, while FKD and PLS are data-free KD approaches that share class-wise average predictions among users.…”
Section: Performance Evaluation (mentioning, confidence: 99%)
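
To illustrate the data-free variant that statement attributes to FKD and PLS, the sketch below aggregates class-wise average predictions on a client; the function and variable names are hypothetical and not taken from those papers:

import torch

def classwise_average_predictions(logits, labels, num_classes):
    """Average a client's softmax outputs per ground-truth class.

    The resulting num_classes x num_classes matrix is the kind of compact,
    data-free knowledge that can be shared instead of a proxy dataset.
    """
    probs = torch.softmax(logits, dim=-1)
    avg = torch.zeros(num_classes, num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            avg[c] = probs[mask].mean(dim=0)
    return avg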
“…Despite the potential for improving efficiency and privacy, FD is sensitive to the training state of local models due to the lack of a well-trained teacher, where the ensemble predictions may have low quality due to the under-fitted local predictors. Besides, the non-identically independently distributed (non-IID) data distributions 21,22 across clients exacerbate this issue, since the local models cannot output accurate predictions on the proxy samples that are outside their local distributions 23 .…”
Mentioning (confidence: 99%)
“…(2) KD-based FL needs a dataset for distillation, which can be client private data [59], publicly available data [10], or artificially generated synthetic data [181]. (3) Typically, KD-based FL lacks a pre-trained teacher model [146], and the initial training performance of the teacher model is suboptimal. However, the teacher model gradually improves reliability and convergence as the training progresses.…”
Section: KD-based FL (mentioning, confidence: 99%)
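
A schematic round of the dataset-based distillation described in that statement, where the ensemble of client predictions on a shared public batch acts as the (initially weak) teacher, might look like the following; the model, optimizer, and data-loader names are placeholders, not an implementation from the cited works:

import torch
import torch.nn.functional as F

def distillation_round(global_model, client_models, public_loader, optimizer, temperature=2.0):
    """One KD-based FL round: the ensemble of client predictions on a public
    dataset is distilled into the global model. Early rounds give a weak
    teacher; its quality improves as the clients' local training progresses."""
    for m in client_models:
        m.eval()
    global_model.train()
    for x, _ in public_loader:  # labels of the public data are not used
        with torch.no_grad():
            teacher_logits = torch.stack([m(x) for m in client_models]).mean(dim=0)
        student_logits = global_model(x)
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean") * temperature ** 2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()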