Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.165

FedED: Federated Learning via Ensemble Distillation for Medical Relation Extraction

Abstract: Unlike other domains, medical texts are inevitably accompanied by private information, so sharing or copying these texts is strictly restricted. However, training a medical relation extraction model requires collecting these privacy-sensitive texts and storing them on one machine, which conflicts with privacy protection. In this paper, we propose a privacy-preserving medical relation extraction model based on federated learning, which enables training a central model with no single piece of private loca…
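
The training scheme summarized in the abstract can be illustrated with a minimal sketch (a hypothetical PyTorch-style outline, not the authors' released code): each client trains only on its own private shard, the server collects the clients' soft predictions on a shared unlabeled pool, and the central model is distilled from that ensemble, so no raw local data ever leaves a client.

```python
# Minimal sketch of federated learning via ensemble distillation
# (hypothetical outline, NOT the authors' released implementation).
# Assumption: clients share only soft predictions on a common unlabeled
# pool; raw private texts never leave a client.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model(n_features=32, n_relations=5):
    return nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                         nn.Linear(64, n_relations))

def local_train(model, x, y, epochs=3, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

# Simulated private shards on three clients plus a shared unlabeled pool.
clients = [(torch.randn(100, 32), torch.randint(0, 5, (100,))) for _ in range(3)]
unlabeled = torch.randn(200, 32)

# 1) Each client trains on its own data locally.
local_models = []
for x, y in clients:
    m = make_model()
    local_train(m, x, y)
    local_models.append(m)

# 2) Only the clients' soft predictions on the shared pool reach the server.
with torch.no_grad():
    ensemble = torch.stack([F.softmax(m(unlabeled), dim=-1)
                            for m in local_models]).mean(dim=0)

# 3) The server distills the ensemble into the central model.
central = make_model()
opt = torch.optim.SGD(central.parameters(), lr=1e-2)
for _ in range(20):
    opt.zero_grad()
    loss = F.kl_div(F.log_softmax(central(unlabeled), dim=-1),
                    ensemble, reduction="batchmean")
    loss.backward()
    opt.step()
```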

Cited by 77 publications (49 citation statements) | References 37 publications

“…There are a few prior works starting to explore federated learning methods in privacy-preserving NLP applications, such as keyboard prediction (Hard et al., 2018; Leroy et al., 2019), intent classification (Zhu et al., 2020), pre-training and fine-tuning language models (Liu and Miller, 2020), and medical named entity recognition (Ge et al., 2020). Sui et al. (2020) is most relevant to our work; it applies federated learning to supervised relation classification. However, in their work, the data stored by the local platforms must be manually labeled in advance, which is difficult to satisfy in practical applications.…”
Section: Related Work
confidence: 99%
“…But in their work, the data stored by the local platforms must be manually labeled in advance, which is difficult to satisfy in practical applications. Compared with Sui et al. (2020), we combine federated learning with distant supervision, which avoids such an impractical assumption.…”
Section: Related Work
confidence: 99%
“…To test the proposed approach, we follow the convention of recent FL-based NLP studies (Liu et al., 2019; Huang et al., 2020b; Zhu et al., 2020; Sui et al., 2020) ([…] our simulation are heterogeneous), which is similar to the simulation setting of the aforementioned previous studies. We split each genre into train/dev/test sets following Wang et al. (2011) and report the statistics (the number of sentences, word tokens, and OOV rate) in Table 2.…”
Section: Simulations
confidence: 99%
“…Unfortunately, limited attention has been paid to this issue. Most existing approaches (Liu et al., 2019; Huang et al., 2020b; Sui et al., 2020) with FL on NLP (e.g., for language modeling) […] [Figure 1: the server-node architecture of our approach] The encrypted information (i.e., encrypted data, word segmentation tags, and loss) is communicated between a node and the server, where the locally stored data is inaccessible to other nodes during the training process.…”
Section: Introduction
confidence: 99%
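
A minimal sketch of the node-server exchange described in the snippet above (the "encryption" below is a placeholder XOR/base64 stand-in, not the cited system's actual protocol): each node packages its local statistics into an opaque payload, and the server unpacks only the agreed-upon fields, never the raw local text.

```python
# Illustrative node/server exchange; the "encryption" is a stand-in
# (XOR with a shared secret, then base64) purely to show that only opaque
# payloads, never raw local text, cross the node/server boundary.
import base64
import json

SECRET = b"shared-session-key"  # hypothetical pre-shared key

def xor_bytes(data, key):
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def node_package(local_loss, seg_tags):
    """Node side: serialize and obscure local statistics before sending."""
    payload = json.dumps({"loss": local_loss, "tags": seg_tags}).encode()
    return base64.b64encode(xor_bytes(payload, SECRET))

def server_unpack(blob):
    """Server side: recover only the agreed-upon statistics, not raw text."""
    return json.loads(xor_bytes(base64.b64decode(blob), SECRET))

blob = node_package(0.73, ["B-ENT", "I-ENT", "O"])
print(server_unpack(blob))  # {'loss': 0.73, 'tags': ['B-ENT', 'I-ENT', 'O']}
```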
“…data settings, i.e., some of the clients have significantly different data distributions from the others. Other approaches require some trusted clients or samples to guide the learning [27, 35, 43] or to detect the updates from corrupted clients [16, 21]. Unfortunately, the credibility of these trusted clients and samples is usually not guaranteed.…”
Section: Introduction
confidence: 99%