Membership Inference Attacks From First Principles

Carlini, Nicholas; Chien, Steve; Nasr, Milad; Song, Shuang; Terzis, Andreas; Tramèr, Florian

doi:10.1109/sp46214.2022.9833649

Cited by 102 publications

(81 citation statements)

References 38 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The conclusions are further analytically and empirically generalized to non-linear feature extractors. Then, we empirically validate that models trained on DC-synthesized data are robust to both vanilla loss-based MIA and the state-of-the-art likelihood-based MIA (Carlini et al, 2022). Finally, we study the visual privacy of DC-synthesized data in case of adversary's direct matching attack.…”

Section: Introductionmentioning

confidence: 87%

“…For LiRA (Carlini et al, 2022), we repeat the preparation of synthetic dataset N m times with different random seeds, and obtain N m shadow T , S and f S . We set N m = 256 for DM and N m = 64 for KIP because of its lower training efficiency.…”

Section: Methodsmentioning

confidence: 99%

“…Likelihood-based MIA. Recent works (Carlini et al, 2022;Rezaei & Liu, 2021) 2 that are not trained on it (called OUT models). The adversary then measures the means µ in , µ out and the variances σ 2 in , σ 2 out of model confidence for IN and OUT models, respectively.…”

Section: Membership Privacymentioning

confidence: 99%

“…However, as we demonstrate in our experiments (Section 5.2), the membership of data used for DC initialization can still be inferred by vanilla loss-based MIA. One countermeasure is to choose hard-to-infer samples (Carlini et al, 2022), i.e., samples whose model outputs are not affected by the membership, as initialization data.…”

Section: S|mentioning

confidence: 99%

“…LiRA is a more powerful MIA because it can achieve higher TPR at low FPR (Carlini et al, 2022), while the adversary's computational cost is higher. Figure 2 provides the ROC curves of LiRA against f S .…”

Section: Membership Privacy Of Fmentioning

confidence: 99%

See 4 more Smart Citations

Privacy for Free: How does Dataset Condensation Help Privacy?

Tian¹,

Zhao²,

Lyu³

2022

Preprint

View full text Add to dashboard Cite

To prevent unintentional data leakage, research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of the data privacy, existing solutions suffer from either expensive training cost or poor generalization performance. Therefore, we raise the question whether training efficiency and privacy can be achieved simultaneously. In this work, we for the first time identify that dataset condensation (DC) which is originally designed for improving training efficiency is also a better solution to replace the traditional data generators for private data generation, thus providing privacy for free. To demonstrate the privacy benefit of DC, we build a connection between DC and differential privacy, and theoretically prove on linear feature extractors (and then extended to non-linear feature extractors) that the existence of one sample has limited impact (O(m/n)) on the parameter distribution of networks trained on m samples synthesized from n(n m) raw samples by DC. We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both the loss-based and the state-of-the-art likelihoodbased membership inference attacks. We envision this work as a milestone for data-efficient and privacy-preserving machine learning.

show abstract

Section: Introductionmentioning

confidence: 87%

Section: Methodsmentioning

confidence: 99%

Section: Membership Privacymentioning

confidence: 99%

Section: S|mentioning

confidence: 99%

Section: Membership Privacy Of Fmentioning

confidence: 99%

See 3 more Smart Citations

Privacy for Free: How does Dataset Condensation Help Privacy?

Tian¹,

Zhao²,

Lyu³

2022

Preprint

View full text Add to dashboard Cite

show abstract

A comprehensive survey of personal knowledge graphs

Chakraborty

Sanyal

2023

WIREs Data Min & Knowl

View full text Add to dashboard Cite

Information that can encapsulate a person's daily life and its different aspects provides insightful knowledge. This knowledge can prove to be more useful than general knowledge for improving personalized tasks. When it comes to storing such knowledge, personal knowledge graphs (PKGs) come in as handy saviors. PKGs are knowledge graphs which store details that are pertinent to a user but not, in general, useful to the rest of humanity. Conversational agents can access these PKGs to answer queries related to the user's day‐to‐day life, whereas recommender systems can harness the knowledge stored in PKGs to make personalized suggestions. Despite the immense applicability of PKGs, there has not been significant research in this area. We present an extensive review of PKGs. We categorize them according to the domains in which they are most relevant; in particular, we highlight the use of PKGs in medicine, finance, and education and research. We also categorize the different ways of constructing a PKG based on the source of data required for such constructions. Furthermore, we discuss the limitations of PKGs and suggest directions for future work.This article is categorized under: Fundamental Concepts of Data and Knowledge > Human Centricity and User Interaction Fundamental Concepts of Data and Knowledge > Knowledge Representation Technologies > Artificial Intelligence

show abstract