Learning Privately over Distributed Features: An ADMM Sharing Approach

Hu, Yaochen; Liu, Peng; Kong, Linglong; Niu, Di

doi:10.48550/arxiv.1907.07735

Cited by 7 publications

(14 citation statements)

References 20 publications

“…As a result, the distributed algorithms developed for (P) can be applied to solving both problems (4) and (7).…”

Section: Application Examples (mentioning)

confidence: 99%

“…The cross-entropy loss function [38] is applied in the last layer. Then, the proposed PDC algorithm and IPDC algorithm are applied to train the classification NN, respectively, by solving problem (7). For the PDC/IPDC algorithms, it is set that $p = 0.01$, $\beta = 0.01$, $\rho \in \{100, 10, 1, 10^{-1}\}$ and $\alpha \in \{10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\}$.…”

Section: Distributed Neural Network (mentioning)

confidence: 99%

“…The agent variables are coupled due to the linear constraint $\sum_{i=1}^{N} B_i x_i = q$. Such problem arises, for example, in machine learning over distributed features [6,7], distributed power control in electrical power networks [8,9], interference management in wireless networks [10], and the network utility maximization problem [11].…”

Section: Introduction (mentioning)

confidence: 99%

Decentralized Non-Convex Learning with Linearly Coupled Constraints

Zhang, Chang, et al. 2021

Preprint

Motivated by the need for decentralized learning, this paper aims at designing a distributed algorithm for solving nonconvex problems with general linear constraints over a multi-agent network. In the considered problem, each agent owns some local information and a local variable for jointly minimizing a cost function, but local variables are coupled by linear constraints. Most of the existing methods for such problems are only applicable to convex problems or problems with specific linear constraints. A distributed algorithm for such problems with general linear constraints in the nonconvex setting is still lacking. In this paper, to tackle this problem, we propose a new algorithm, called the "proximal dual consensus" (PDC) algorithm, which combines a proximal technique and a dual consensus method. We build the theoretical convergence conditions and show that the proposed PDC algorithm can converge to an $\epsilon$-Karush-Kuhn-Tucker solution within $O(1/\epsilon)$ iterations. For computation reduction, the PDC algorithm can choose to perform cheap gradient descent per iteration while preserving the same order of $O(1/\epsilon)$ iteration complexity. Numerical results are presented to demonstrate the good performance of the proposed algorithms for solving a regression problem and a classification problem over a network where agents have only partial observations of data features. Jiawei Zhang and Songyang Ge contribute equally. Part of the work was presented in IEEE ICASSP 2020 [1].

“…Most studies (e.g., [24,25,39,47,57,70]) in VFL focus on training and simply assumes record linkage has been done (i.e., implicit exact linkage on record ID), which is impractical since most real-world federated datasets are unlinked. Some approaches exactly link the identifiers by exact PPRL [9] or private set intersection (PSI) [9,36,37,49,62].…”

Section: Related Work (mentioning)

confidence: 99%

“…However, they also focus only on the most similar identifiers and assume there is a one-to-one mapping between the data records of two parties, which is not always true in practice. Current VFL frameworks support various machine learning models including linear regression [16], logistic regression [24], support vector machine [35], gradient boosting decision trees [9,62]. FDML [25] supports neural networks but it requires all the parties to hold labels.…”

Section: Related Work (mentioning)

confidence: 99%

A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning

Wu, Li, He. 2021

Preprint

As the privacy of machine learning has drawn increasing attention, federated learning is introduced to enable collaborative learning without revealing raw data. Notably, vertical federated learning (VFL), where parties share the same set of samples but only hold partial features, has a wide range of real-world applications. However, existing studies in VFL rarely study the "record linkage" process. They either design algorithms assuming the data from different parties have been linked or use simple linkage methods like exact-linkage or top1-linkage. These approaches are unsuitable for many applications, such as the GPS location and noisy titles requiring fuzzy matching. In this paper, we design a novel similarity-based VFL framework, FedSim, which is suitable for more real-world applications and achieves higher performance on traditional VFL tasks. Moreover, we theoretically analyze the privacy risk caused by sharing similarities. Our experiments on three synthetic datasets and five real-world datasets with various similarity metrics show that FedSim consistently outperforms other state-of-the-art baselines. Preprint. Under review.

Privacy-Preserving Federated Learning Model for Healthcare Data

Islam, Ghasemi, Mohammed. 2022

2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC)

Federated Learning (FL) is a method for training machine learning algorithms on decentralized data where sharing the raw data is not feasible due to privacy regulations. An instance of such data is Electronic Health Records (EHRs), which contain confidential patient information. In FL, the sensitive data is not shared, rather local models are trained and the model parameters are then aggregated on a central server. However, this method presents privacy challenges, necessitating the implementation of privacy protection strategies, such as data anonymization, before sharing the model parameters. Balancing the trade-off between privacy and utility is a crucial aspect in FL research, as integrating privacy algorithms can have an impact on the utility. The objective of this thesis is to improve the performance of FL while maintaining privacy, through techniques like data generalization, feature selection for dimension reduction, and minimizing noise in the anonymization process. This research also investigates separating data based on features instead of records and evaluates the performance of the proposed model using real healthcare data, with the aim of developing a predictive model for healthcare applications.
