Stacey Truex scite author profile

A Hybrid Approach to Privacy-Preserving Federated Learning

Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training processes does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model while ensuring the resulting model also has acceptable predictive accuracy. Existing federated learning approaches either use secure multiparty computation (SMC), which is vulnerable to inference, or differential privacy, which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on three different machine learning algorithms. Our experiments demonstrate that our approach outperforms state-of-the-art solutions.
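The noise-growth claim in this abstract can be illustrated with a small sketch. It is a simplified assumption, not the paper's protocol: it supposes that all parties are honest, that secure aggregation reveals only the sum of the parties' noisy values, and that each party adds independent zero-mean Gaussian noise. The function names and the plain summation standing in for SMC are hypothetical.

```python
import math
import random


def per_party_sigma(target_sigma: float, n_parties: int) -> float:
    """Std dev each party uses so that the summed noise has std dev target_sigma."""
    return target_sigma / math.sqrt(n_parties)


def aggregate_noise_variance(sigma_each: float, n_parties: int) -> float:
    """Variance of the sum of n independent zero-mean Gaussian noise terms."""
    return n_parties * sigma_each ** 2


def noisy_sum(values, sigma_each, rng):
    """Stand-in for secure aggregation: only the noisy total is returned."""
    return sum(v + rng.gauss(0.0, sigma_each) for v in values)


if __name__ == "__main__":
    sigma, n = 2.0, 4
    # Each party adds full-strength noise: total variance grows with n.
    print(aggregate_noise_variance(sigma, n))
    # Each party adds a 1/sqrt(n) share: total variance stays at sigma**2.
    print(aggregate_noise_variance(per_party_sigma(sigma, n), n))
    print(noisy_sum([1.0, 2.0, 3.0, 4.0], per_party_sigma(sigma, n), random.Random(0)))
```

Under these assumptions, the total noise variance stays fixed as parties are added instead of growing linearly with their number, which is the intuition behind combining the two techniques.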


Data Poisoning Attacks Against Federated Learning Systems

Tolpegin et al. 2020


Federated learning (FL) is an emerging paradigm for distributed training of large-scale deep neural networks in which participants' data remains on their own devices with only model updates being shared with a central server. However, the distributed nature of FL gives rise to new threats caused by potentially malicious participants. In this paper, we study targeted data poisoning attacks against FL systems in which a malicious subset of the participants aims to poison the global model by sending model updates derived from mislabeled data. We first demonstrate that such data poisoning attacks can cause substantial drops in classification accuracy and recall, even with a small percentage of malicious participants. We additionally show that the attacks can be targeted, i.e., they have a large negative impact only on classes that are under attack. We also study attack longevity in early/late round training, the impact of malicious participant availability, and the relationships between the two. Finally, we propose a defense strategy that can help identify malicious participants in FL to circumvent poisoning attacks, and demonstrate its effectiveness.
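As a rough illustration of what "model updates derived from mislabeled data" can mean, the sketch below relabels one class as another in a malicious participant's local dataset and measures per-class recall, the metric the abstract says is affected. The class indices, function names, and toy data are hypothetical; the abstract does not specify the attack at this level of detail.

```python
def flip_labels(labels, source_class, target_class):
    """Relabel every example of source_class as target_class (targeted mislabeling)."""
    return [target_class if y == source_class else y for y in labels]


def class_recall(y_true, y_pred, cls):
    """Fraction of true examples of cls that were predicted as cls."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not relevant:
        return 0.0
    return sum(1 for p in relevant if p == cls) / len(relevant)


if __name__ == "__main__":
    clean = [0, 5, 3, 5, 1]
    poisoned = flip_labels(clean, source_class=5, target_class=3)
    print(poisoned)
    # A model that reproduced the poisoned labels would lose recall only on class 5.
    print(class_recall(clean, poisoned, 5), class_recall(clean, poisoned, 0))
```

Only the source class is touched, which matches the abstract's observation that a targeted attack mainly harms the classes under attack.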


Differentially Private Model Publishing for Deep Learning

Liu et al. 2019

199

176


Deep learning techniques based on neural networks have shown significant success in a wide range of AI tasks. Large-scale training datasets are one of the critical factors for their success. However, when the training datasets are crowdsourced from individuals and contain sensitive information, the model parameters may encode private information and bear the risks of privacy leakage. The recent growing trend of the sharing and publishing of pre-trained models further aggravates such privacy risks. To tackle this problem, we propose a differentially private approach for training neural networks. Our approach includes several new techniques for optimizing both privacy loss and model accuracy. We employ a generalization of differential privacy called concentrated differential privacy (CDP), with both a formal and refined privacy loss analysis on two different data batching methods. We implement a dynamic privacy budget allocator over the course of training to improve model accuracy. Extensive experiments demonstrate that our approach effectively improves privacy loss accounting, training efficiency, and model quality under a given privacy budget.


Demystifying Membership Inference Attacks in Machine Learning as a Service

Truex, Ling, Gürsoy, et al. 2021

IEEE Trans. Serv. Comput.

156

120


Membership inference attacks seek to infer membership of individual training instances of a model to which an adversary has black-box access through a machine learning-as-a-service API. In providing an in-depth characterization of membership privacy risks against machine learning models, this paper presents a comprehensive study towards demystifying membership inference attacks from two complementary perspectives. First, we provide a generalized formulation of the development of a black-box membership inference attack model. Second, we characterize the importance of model choice on model vulnerability through a systematic evaluation of a variety of machine learning models and model combinations using multiple datasets. Through formal analysis and empirical evidence from extensive experimentation, we characterize under what conditions a model may be vulnerable to such black-box membership inference attacks. We show that membership inference vulnerability is data-driven and corresponding attack models are largely transferable. Though different model types display different vulnerabilities to membership inference, so do different datasets. Our empirical results additionally show that (1) using the type of target model under attack within the attack model may not increase attack effectiveness and (2) collaborative learning exposes vulnerabilities to membership inference risks when the adversary is a participant. We also discuss countermeasures and mitigation strategies.


Efficient and Private Scoring of Decision Trees, Support Vector Machines and Logistic Regression Models Based on Pre-Computation

Cock, Dowsley, Horst, et al. 2019

IEEE Trans. Dependable and Secure Comput.

108

106


TiFL: A Tier-based Federated Learning System

et al. 2020


A Framework for Evaluating Client Privacy Leakages in Federated Learning

Wei, Liu, Loper, et al. 2020


Utility-Aware Synthesis of Differentially Private and Attack-Resilient Location Traces

Gürsoy, Liu, Truex, et al. 2018


Differentially private location trace synthesis (DPLTS) has recently emerged as a solution to protect mobile users' privacy while enabling the analysis and sharing of their location traces. A key challenge in DPLTS is to best preserve the utility in location trace datasets, which is non-trivial considering the high dimensionality, complexity, and heterogeneity of datasets, as well as the diverse types and notions of utility. In this paper, we present OptaTrace: a utility-optimized and targeted approach to DPLTS. Given a real trace dataset D, the differential privacy parameter ε controlling the strength of privacy protection, and the utility/error metric Err of interest, OptaTrace uses Bayesian optimization to optimize DPLTS such that the output error (measured in terms of the given metric Err) is minimized while ε-differential privacy is satisfied. In addition, OptaTrace introduces a utility module that contains several built-in error metrics for utility benchmarking and for choosing Err, as well as a frontend web interface for accessible and interactive DPLTS service. Experiments show that OptaTrace's optimized output can yield substantial utility improvement and error reduction compared to previous work.


