While machine learning techniques are being applied to various fields for their exceptional ability to find complex relations in large datasets, the strengthening of regulations on data ownership and privacy is causing increasing difficulty in its application to medical data. In light of this, Federated Learning has recently been proposed as a solution to train on private data without breach of confidentiality. This conservation of privacy is particularly appealing in the field of healthcare, where patient data is highly confidential. However, many studies have shown that its assumption of Independent and Identically Distributed data is unrealistic for medical data. In this paper, we propose Personalized Federated Cluster Models, a hierarchical clusteringbased FL process, to predict Major Depressive Disorder severity from Heart Rate Variability. By allowing clients to receive more personalized model, we address problems caused by non-IID data, showing an accuracy increase in severity prediction. This increase in performance may be sufficient to use Personalized Federated Cluster Models in many existing Federated Learning scenarios.
Purpose This study aims to summarize the critical issues in medical federated learning and applicable solutions. Also, detailed explanations of how federated learning techniques can be applied to the medical field are presented. About 80 reference studies described in the field were reviewed, and the federated learning framework currently being developed by the research team is provided. This paper will help researchers to build an actual medical federated learning environment. Design/methodology/approach Since machine learning techniques emerged, more efficient analysis was possible with a large amount of data. However, data regulations have been tightened worldwide, and the usage of centralized machine learning methods has become almost infeasible. Federated learning techniques have been introduced as a solution. Even with its powerful structural advantages, there still exist unsolved challenges in federated learning in a real medical data environment. This paper aims to summarize those by category and presents possible solutions. Findings This paper provides four critical categorized issues to be aware of when applying the federated learning technique to the actual medical data environment, then provides general guidelines for building a federated learning environment as a solution. Originality/value Existing studies have dealt with issues such as heterogeneity problems in the federated learning environment itself, but those were lacking on how these issues incur problems in actual working tasks. Therefore, this paper helps researchers understand the federated learning issues through examples of actual medical machine learning environments.
Federated Learning is a distributed machine learning framework designed for data privacy preservation i.e., local data remain private throughout the entire training and testing procedure. Federated Learning is gaining popularity because it allows one to use machine learning techniques while preserving privacy. However, it inherits the vulnerabilities and susceptibilities raised in deep learning techniques. For instance, Federated Learning is particularly vulnerable to data poisoning attacks that may deteriorate its performance and integrity due to its distributed nature and inaccessibility to the raw data. In addition, it is extremely difficult to correctly identify malicious clients due to the non-Independently and/or Identically Distributed (non-IID) data. The real-world data can be complex and diverse, making them hardly distinguishable from the malicious data without direct access to the raw data. Prior research has focused on detecting malicious clients while treating only the clients having IID data as benign. In this study, we propose a method that detects and classifies anomalous clients from benign clients when benign ones have non-IID data. Our proposed method leverages feature dimension reduction, dynamic clustering, and cosine similarity-based clipping. The experimental results validates that our proposed method not only classifies the malicious clients but also alleviates their negative influences from the entire procedure. Our findings may be used in future studies to effectively eliminate anomalous clients when building a model with diverse data. CCS CONCEPTS• Security and privacy; • Computing methodologies → Machine learning; Distributed artificial intelligence;
Since the federated learning, which makes AI learning possible without moving local data around, was introduced by google in 2017 it has been actively studied particularly in the field of medicine. In fact, the idea of machine learning in AI without collecting data from local clients is very attractive because data remain in local sites. However, federated learning techniques still have various open issues due to its own characteristics such as non identical distribution, client participation management, and vulnerable environments. In this presentation, the current issues to make federated learning flawlessly useful in the real world will be briefly overviewed. They are related to data/system heterogeneity, client management, traceability, and security. Also, we introduce the modularized federated learning framework, we currently develop, to experiment various techniques and protocols to find solutions for aforementioned issues. The framework will be open to public after development completes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.