2020
DOI: 10.1200/cci.19.00047
Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care

Abstract: Big data for health care is one of the potential solutions to deal with the numerous challenges of health care, such as rising cost, aging population, precision medicine, universal health coverage, and the increase of noncommunicable diseases. However, data centralization for big data raises privacy and regulatory concerns. Covered topics include (1) an introduction to privacy of patient data and distributed learning as a potential solution to preserving these data, a description of the legal context for pati…

Cited by 87 publications (55 citation statements)
References 64 publications
“…Thirdly, the threshold to go to the hospital and hospitalisation management can vary from country to country, and we are also aware that RNA viruses can mutate rapidly and could have an impact on the performance of the models. We therefore propose that those models should be continuously updated to achieve better performance, for example using privacy-preserving distributed learning approaches [32, 33]. Fourthly, the CT features used for this study are semantic features from the first CT scan, and radiomics or deep learning approaches may improve its prognostic performance, and follow-up CT scans may yield more information.…”
Section: Discussion
confidence: 99%
“…We therefore propose that those models should be continuously updated, for example using privacy-preserving distributed learning approaches [29, 30]. Fourth, the CT features used for this study are semantic features from the first CT scan, and quantitative features automatically extracted from CT images using radiomics or deep learning approaches may improve its prognostic performance, and follow-up CT scans may yield more information. Finally, there is also the fundamental weakness of nomograms, which do not give a confidence interval for the final output.…”
Section: Discussion
confidence: 99%
“…Similar works have been demonstrated by Kuo et al [45], [46] leveraging blockchain using Logistic Regression machine learning models. While these pipelines [13,32,33] overcome the risk of exposing the model weights, Lugan et al [47] proposed to train distributed learning models on encrypted data, preventing any exposure of local weights. Nevertheless, when implementing deep learning and encrypting model weights, model design requires careful consideration as aspects such as the CNN activation functions must be adapted [48].…”
Section: Discussion and Future Work
confidence: 99%
“…This approach, proposed in 2013, is known as distributed learning (federated learning) [11], [12]. Distributed learning, a fusion of machine learning and distributed computing, allows machine learning models to be trained on multiple siloed datasets without the need for patient data to leave the firewalls of each database [13]. Distributed learning preserves privacy by design by sharing model weights for subsequent training cycles instead of privacy-sensitive data.…”
Section: Introduction
confidence: 99%
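The weight-sharing training cycle described in the quoted passage can be illustrated with a minimal federated-averaging sketch. This is an illustrative toy, not the pipeline used in the cited works: the linear model, the simulated "hospital" datasets, the hyperparameters, and the function names (`local_update`, `federated_round`) are all assumptions chosen for clarity. The key point it demonstrates is that only model weights cross site boundaries; the raw per-site data never get pooled.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One site's local training: plain gradient descent on a linear model.
    # Only the updated weights are returned to the coordinator.
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, sites):
    # Each site trains on its own data behind its firewall.
    local_ws = [local_update(global_w, X, y) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    # FedAvg-style aggregation: average site weights, weighted by site size.
    return np.average(local_ws, axis=0, weights=sizes)

# Two simulated site datasets that are never centralized.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
sites = []
for n in (40, 60):
    X = rng.normal(size=(n, 2))
    sites.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(50):
    w = federated_round(w, sites)
```

After repeated rounds, the shared model converges toward the weights that fit both sites' data, even though neither site ever saw the other's records. Real deployments add secure aggregation or encryption on top of this exchange, as the quoted discussion of encrypted weights notes.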