Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy 2020
DOI: 10.1145/3374664.3375722

A Baseline for Attribute Disclosure Risk in Synthetic Data

Abstract: The generation of synthetic data is widely considered a viable method for alleviating privacy concerns and for reducing identification and attribute disclosure risk in micro-data. The records in a synthetic dataset are artificially created and thus do not directly relate to individuals in the original data in terms of a 1-to-1 correspondence. As a result, inferences about said individuals appear to be infeasible and, simultaneously, the utility of the data may be kept at a high level. In this paper, we challe…

Cited by 27 publications (23 citation statements)
References: 14 publications
“…Attribute Disclosure. This kind of privacy violation happens whenever access to data allows an attacker to learn new information about a specific individual [10], e.g., the value of a particular attribute like race, age, income, etc. Unfortunately, if the real data contains strong correlations between attributes, these correlations will likely be replicated in the synthetic data and available to the adversary [11].…”
Section: Risks Of Using Synthetic Data
Citation type: mentioning
confidence: 99%
“…The construction of synthetic datasets and their utility metrics have become an exciting research problem. Further exploration of this avenue also compared the protection provided by fake data against conventional methods like k-anonymization (Hittmeir et al 2020). Recent findings showed that synthetic datasets having similar statistical properties as real data may offer privacy protection against inference attacks.…”
Section: Literature Survey
Citation type: mentioning
confidence: 99%
“…However, there are a growing number of technical definitions and assessments of privacy being introduced, that serve practitioners well to make the legal case. Two commonly used concepts within the context of synthetic data are empirical attribute disclosure assessments (Taub et al, 2018; Hittmeir et al, 2020), and Differential Privacy (Dwork et al, 2006). Both of these have proven to be useful in establishing trust in the safety of synthetic data, yet come with their own challenges in practice.…”
Section: Related Work
Citation type: mentioning
confidence: 99%