2017 IEEE Symposium on Security and Privacy (SP)
DOI: 10.1109/sp.2017.41
Membership Inference Attacks Against Machine Learning Models

Abstract: We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs…
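
The attack summarized in the abstract can be sketched end to end with shadow models: train stand-in models on similar data, record their prediction vectors on members and non-members of their own training sets, and fit an attack model on those vectors. The following is a minimal sketch of that idea, assuming synthetic data, scikit-learn classifiers, and a single class-agnostic attack model; the paper itself trains several shadow models and one attack model per output class.

# Minimal sketch of the shadow-model membership inference attack described in the
# abstract, in the black-box setting. The synthetic data, the scikit-learn models,
# and the single class-agnostic attack model are illustrative assumptions, not the
# authors' code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for data drawn from the same distribution as the target's
# training data (the role the paper's shadow training data plays).
X, y = make_classification(n_samples=6000, n_features=20, n_informative=10,
                           n_classes=2, random_state=0)

def fit_model(X_tr, y_tr):
    # Black-box classifier whose prediction (confidence) vectors the attacker observes.
    return MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                         random_state=0).fit(X_tr, y_tr)

# 1) Target model: trained on one half of its split; the other half is "out".
X_t, X_s, y_t, y_s = train_test_split(X, y, test_size=0.5, random_state=1)
X_t_in, X_t_out, y_t_in, y_t_out = train_test_split(X_t, y_t, test_size=0.5, random_state=2)
target = fit_model(X_t_in, y_t_in)

# 2) Shadow models: imitate the target and record their prediction vectors on
#    members ("in") and non-members ("out") of their own training sets.
attack_X, attack_y = [], []
for seed in range(3):
    Xi, Xo, yi, yo = train_test_split(X_s, y_s, test_size=0.5, random_state=seed)
    shadow = fit_model(Xi, yi)
    attack_X += [shadow.predict_proba(Xi), shadow.predict_proba(Xo)]
    attack_y += [np.ones(len(Xi)), np.zeros(len(Xo))]
attack_X, attack_y = np.vstack(attack_X), np.concatenate(attack_y)

# 3) Attack (inference) model: prediction vector -> member / non-member.
attack = LogisticRegression().fit(attack_X, attack_y)

# 4) Query the target on known members and non-members and measure attack accuracy.
probs = np.vstack([target.predict_proba(X_t_in), target.predict_proba(X_t_out)])
truth = np.concatenate([np.ones(len(X_t_in)), np.zeros(len(X_t_out))])
print("membership inference accuracy:", (attack.predict(probs) == truth).mean())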

Cited by 2,684 publications (3,037 citation statements)
References 28 publications (48 reference statements)
“…Compute per-example gradients of the discriminator loss on the fake data Z_t and clip them:
    for i ∈ Z_t do
        grad_d_fake_t ← ∇_{θ_d} d_loss_fake(θ_{d_t}, Z_i)
        grad_d_fake_t ← grad_d_fake_t / max(1, ‖grad_d_fake_t‖₂ / C)
Compute the overall discriminator gradient and add Gaussian noise:
    grad_d_t ← (1/bs) · (grad_d_real_t + grad_d_fake_t + N(0, σ²C²I))
Take the gradient descent step for the discriminator:
    θ_{d,t+1} ← SGD(grad_d_t, θ_{d,t}, lr)
/* Update RDP accountant */ Accumulate the spent privacy budget using the RDP accountant.
/* Update the generator network */
    g_loss ← log(1 − D(G(Z_t)))
    grad_g_t ← ∇_{θ_g} g_loss(θ_{g,t}, Z_i)
    θ_{g,t+1} ← ADAM(grad_g_t, θ_{g,t})
If the spent ε or the spent δ exceeds the budget, training stops.
The dataset used in the evaluation is the MNIST handwritten digit dataset, containing 60k training samples and 10k test samples. In the experiments, the batch size is set to 600, δ = 10⁻⁵, and the learning rate follows an adaptive schedule: the initial learning rate is 0.15, it is decreased to 0.052 at iteration 10K, and it is kept at 0.052 for the remaining iterations.…”
Section: Results
mentioning confidence: 99%
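
The quoted procedure is essentially DP-SGD applied to the GAN discriminator: clip each per-example gradient to norm C, add Gaussian noise N(0, σ²C²I), average over the batch, and let an RDP accountant track the spent budget. The numpy sketch below illustrates only that clipping-and-noising step on a toy linear discriminator; the model, the data, and the hyperparameter values are assumptions made for illustration, not the cited paper's implementation.

# A minimal numpy sketch of the differentially private discriminator update in the
# quoted algorithm: per-example gradients are clipped to norm C, summed, perturbed
# with Gaussian noise N(0, sigma^2 C^2 I), averaged over the batch, and applied
# with SGD. The linear "discriminator" and all hyperparameter values are
# illustrative assumptions, not the cited paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, bs = 16, 600                 # feature dimension, batch size (600 as in the quote)
C, sigma, lr = 1.0, 1.1, 0.15   # clipping norm, noise multiplier, learning rate

theta_d = np.zeros(d)           # discriminator parameters

def per_example_grads(theta, X, labels):
    # Gradient of the logistic loss for each example (one row of X per example).
    p = 1.0 / (1.0 + np.exp(-X @ theta))          # D(x)
    return (p - labels)[:, None] * X              # shape (bs, d)

# One private update step on a batch of "real" and "fake" samples.
X_real = rng.normal(size=(bs, d))
X_fake = rng.normal(loc=0.5, size=(bs, d))        # stand-in for G(z)

grads = np.vstack([per_example_grads(theta_d, X_real, np.ones(bs)),
                   per_example_grads(theta_d, X_fake, np.zeros(bs))])

# Clip each per-example gradient: g <- g / max(1, ||g||_2 / C)
norms = np.linalg.norm(grads, axis=1, keepdims=True)
grads = grads / np.maximum(1.0, norms / C)

# Sum, add Gaussian noise N(0, sigma^2 C^2 I), and average over the batch.
noisy = grads.sum(axis=0) + rng.normal(scale=sigma * C, size=d)
theta_d = theta_d - lr * noisy / (2 * bs)         # SGD step for the discriminator

# In the quoted algorithm an RDP accountant would now accumulate the spent privacy
# budget, and training stops once the spent (epsilon, delta) exceeds the target.
print("updated discriminator parameters:", theta_d[:4])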
“…Some previous studies have proposed approaches to addressing the problem of preserving privacy in Deep Learning. Shokri et al [22] developed a distributed approach in which multiple parties train a model on their local training set independently. Then, each party selects a set of key parameters, and shares them with the other parties.…”
Section: Related Work
mentioning confidence: 99%
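
One way to picture the selective sharing scheme attributed to Shokri et al. [22] is the toy loop below: each party takes a gradient step on its private data and uploads only a small set of "key" parameters to the shared model. The least-squares model, the top-k-by-change selection rule, and all sizes are assumptions made for illustration and stand in for whatever selection criterion the original work uses.

# A minimal sketch of selective parameter sharing: parties train locally and share
# only a chosen subset of parameters. The selection rule, model, and data are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
d, n_parties, k, lr = 50, 3, 5, 0.1       # model size, parties, shared params per round

global_params = np.zeros(d)               # parameters known to every party

def local_gradient(params, X, y):
    # Gradient of a least-squares loss on one party's private data.
    return X.T @ (X @ params - y) / len(y)

# Each party's private dataset (never leaves the party).
parties = [(rng.normal(size=(100, d)), rng.normal(size=100)) for _ in range(n_parties)]

for round_ in range(20):
    for X, y in parties:
        # Train locally starting from the current shared parameters.
        local = global_params.copy()
        local -= lr * local_gradient(local, X, y)
        # Select the k "key" parameters (largest local change) and share only those.
        delta = local - global_params
        top_k = np.argsort(np.abs(delta))[-k:]
        global_params[top_k] += delta[top_k]

print("shared parameters after collaborative training:", global_params[:5])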
“…Table 4 illustrates the attack results on matrix-factorization-based recommender systems when we weight normal users, where the experimental settings are the same as those in Table 1. Here, "Weighting" means that we weight each normal user and optimize the attack of (30) over the weighted normal users, and the weight of each normal user is computed based on (31). Comparing Tables 1 and 4, we can see that the performance is improved when we consider the weights of different normal users with respect to the target items.…”
Section: Weighting Normal Users
mentioning confidence: 99%
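
Because equations (30) and (31) are not reproduced in this excerpt, the snippet below is only a hypothetical illustration of the weighting idea: each normal user contributes to the attack objective with a weight reflecting how informative that user is about the target items. The rating data, the co-rating-based weight rule, and the score-based loss are placeholders, not the paper's actual formulas.

# Hypothetical illustration of weighting normal users in an attack objective on a
# matrix-factorization recommender. The weight rule and the loss below are
# placeholders for the excerpt's equations (30) and (31), which are not shown here.
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items = 100, 50
ratings = (rng.random((n_users, n_items)) < 0.1) * rng.integers(1, 6, (n_users, n_items))
target_items = [3, 7]

# Placeholder weight rule: weight a normal user more if they rated a target item.
co_rated = (ratings[:, target_items] > 0).any(axis=1)
weights = 1.0 + co_rated.astype(float)

# Placeholder objective: a weighted sum over normal users of the (negative)
# predicted scores of the target items under a factorized rating model.
user_factors = rng.normal(size=(n_users, 8))
item_factors = rng.normal(size=(n_items, 8))
pred = user_factors @ item_factors.T
weighted_attack_loss = -(weights[:, None] * pred[:, target_items]).sum()
print("weighted attack objective (illustrative):", weighted_attack_loss)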
“…Often, an aggregator is a central entity that also redistributes the merged model parameters to all participants but other topologies have been used as well, e.g., co-locating an aggregator with each participant. However, this approach still poses privacy risks: inference attacks in the learning phase have been proposed by [30]; deriving private information from a trained model has been demonstrated in [37]; and a model inversion attack has been presented in [19].…”
Section: Introduction
mentioning confidence: 99%