2021
DOI: 10.48550/arxiv.2106.03408
Preprint

Antipodes of Label Differential Privacy: PATE and ALIBI

Abstract: We consider the privacy-preserving machine learning (ML) setting where the trained model must satisfy differential privacy (DP) with respect to the labels of the training examples. We propose two novel approaches based on, respectively, the Laplace mechanism and the PATE framework, and demonstrate their effectiveness on standard benchmarks. While recent work by Ghazi et al. proposed Label DP schemes based on a randomized response mechanism, we argue that additive Laplace noise coupled with Bayesian inference (ALIBI) …
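The abstract names the mechanism only at a high level, so the following is a minimal, hedged Python sketch of the idea it describes: release a Laplace-noised one-hot label, then form a Bayesian posterior over classes from the noisy vector and a per-class prior. This is an illustration under stated assumptions, not the authors' ALIBI implementation; the function names, the uniform prior, and the choice to treat the full noisy vector as the observation are assumptions made here.

import numpy as np

def noisy_one_hot_label(label, num_classes, epsilon, rng):
    # One-hot encode the label and add Laplace noise. Changing the label
    # moves two coordinates by 1 each (L1 sensitivity 2), so scale
    # 2/epsilon makes this single release epsilon-label-DP.
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return one_hot + rng.laplace(scale=2.0 / epsilon, size=num_classes)

def label_posterior(noisy, prior, epsilon):
    # P(label = k | noisy vector) under the Laplace likelihood and a
    # per-class prior (e.g. the model's current softmax prediction).
    scale = 2.0 / epsilon
    prior = np.asarray(prior, dtype=float)
    log_post = np.log(prior + 1e-12)
    for k in range(len(prior)):
        clean = np.zeros(len(prior))
        clean[k] = 1.0
        log_post[k] += -np.abs(noisy - clean).sum() / scale
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

rng = np.random.default_rng(0)
noisy = noisy_one_hot_label(label=3, num_classes=10, epsilon=1.0, rng=rng)
posterior = label_posterior(noisy, prior=np.full(10, 0.1), epsilon=1.0)

The posterior can then be used as a soft training target; how exactly it enters the loss is a design choice this sketch does not pin down.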

Cited by 4 publications (7 citation statements)
References 19 publications
“…While GeoPointGAN demonstrably fulfills the requirements of label-LDP, the actual level of privacy provided is higher in many practical settings. As Malek et al. [49] note, simply removing the sensitive labels of a public dataset and training a model in an unsupervised fashion complies with label-(L)DP. We take this further by accounting for situations where, even if we remove the identifier (e.g., taxi ID, 311 caller name), we are still conscious of leaking information based on knowledge regarding the veracity of each point (e.g., if locations can help to identify the caller).…”
Section: Final Remarks
confidence: 96%
“…Label DP. Label-DP was formally introduced by Chaudhuri and Hsu [13] and has since been the focus of several studies [24,32,49,62,73]. All of these works are based on the same premise as our work: only the labels attached to data are sensitive, with the data itself being non-sensitive.…”
Section: Related Work
confidence: 99%
“…Rahman et al. [2018] also use membership inference attacks to measure the privacy loss on models trained with differentially private algorithms. The empirical performance of membership inference attacks has also been used to provide lower bounds on the privacy guarantees achieved by various differentially private algorithms […, Nasr et al., 2021, Malek et al., 2021]. The key difference between the empirical analysis of membership inference in the previous three works and other works is that they simulate the exact adversary in differential privacy, i.e., they train multiple models with and without one particular training point and keep the rest of the training set fixed.…”
Section: Differential Privacy and Membership Inference
confidence: 99%
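The leave-one-out adversary this quote describes can be written down in a few lines. The sketch below is illustrative only: train_model and loss_on_target are hypothetical placeholders for the training routine and per-example score of whatever pipeline is being audited, not functions from the cited papers.

import numpy as np

def leave_one_out_scores(train_model, loss_on_target, fixed_dataset, target,
                         n_trials, seed=0):
    # Simulate the exact DP adversary: the rest of the training set stays
    # fixed, and only the presence of `target` differs between the two worlds.
    rng = np.random.default_rng(seed)
    scores_in, scores_out = [], []
    for _ in range(n_trials):
        model_out = train_model(fixed_dataset, seed=int(rng.integers(1 << 31)))
        model_in = train_model(fixed_dataset + [target], seed=int(rng.integers(1 << 31)))
        scores_out.append(loss_on_target(model_out, target))
        scores_in.append(loss_on_target(model_in, target))
    return np.array(scores_in), np.array(scores_out)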
“…Various factors, such as the distribution of the training data and differences between the train and test distributions, may provide an over-estimate or under-estimate of the actual privacy risk from the model […, Humphries et al., 2020]. Theoretical analyses that connect the success of membership inference to privacy risk through the framework of differential privacy avoid this issue by slightly modifying how the attack performance is measured [Yeom et al., 2018, Nasr et al., 2021, Malek et al., 2021]. Instead of measuring the leakage from a particular model, these works aim at measuring the worst-case leakage of the training algorithm (for a given model architecture), and construct multiple models with and without one training point, keeping the rest of the training set fixed.…”
Section: Introduction
confidence: 99%
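One common way such audits turn the attack's performance into a number is the hypothesis-testing bound for pure differential privacy; the sketch below consumes the scores_in/scores_out arrays produced by the leave-one-out sketch above. The threshold sweep and the rule "flag as member when the score is small" are illustrative assumptions, and a careful audit would report finite-sample confidence intervals rather than this point estimate.

import numpy as np

def empirical_epsilon_lower_bound(scores_in, scores_out):
    # Sweep decision thresholds and report the largest epsilon consistent
    # with the observed true/false positive rates, using the pure-DP bounds
    #   TPR <= exp(eps) * FPR   and   (1 - FPR) <= exp(eps) * (1 - TPR).
    best = 0.0
    for t in np.unique(np.concatenate([scores_in, scores_out])):
        tpr = np.mean(scores_in <= t)   # members tend to have lower loss
        fpr = np.mean(scores_out <= t)
        for num, den in [(tpr, fpr), (1.0 - fpr, 1.0 - tpr)]:
            if num > 0 and den > 0:
                best = max(best, float(np.log(num / den)))
    return best

Any threshold at which the ratio exceeds one certifies that the training algorithm cannot satisfy a smaller epsilon, which is exactly the lower-bound use of membership inference described in the quotes.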