2012
DOI: 10.14778/2428536.2428539
On differentially private frequent itemset mining

Abstract: We consider differentially private frequent itemset mining. We begin by exploring the theoretical difficulty of simultaneously providing good utility and good privacy in this task. While our analysis proves that in general this is very difficult, it leaves a glimmer of hope in that our proof of difficulty relies on the existence of long transactions (that is, transactions containing many items). Accordingly, we investigate an approach that begins by truncating long transactions, trading off errors introduced b…

Cited by 88 publications (116 citation statements)
References: 22 publications
“…Roth and Roughgarden [30] propose the median mechanism for an interactive setting, where the adversary requests statistics about D multiple times in an adaptive fashion. There are also works that focus on other specialized scenarios, such as range-count queries [32,8], query consistency [19], sparse data [9,23], correlated queries [18,21], frequent itemsets [33,22], trajectories [20], minimization of relative error [31], and non-numerical query answers [26]. PINQ [25] is a system implementation that integrates differential privacy with data analysis.…”
Section: Differential Privacy (mentioning)
confidence: 99%
“…In the data mining community, the pattern mining problem is often associated with patterns in the form of itemsets. Despite the wide range of algorithmic solutions to this problem, only a few approaches [4,14,24] study the mining of frequent patterns with differential privacy. However, due to the nature of the patterns, these approaches are not suitable in our setting.…”
Section: Related Work (mentioning)
confidence: 99%
“…This privacy framework provides strong and provable guarantees of privacy and it has become the de facto standard for research in data privacy. Only recently, several techniques [4,14,24] have been proposed for mining frequent patterns under this privacy model. Although these solutions have been shown to be effective in some scenarios, the pattern model used is in the form of itemset which makes these approaches unsuitable for capturing the sequentiality of the events in the data.…”
Section: Introduction (mentioning)
confidence: 99%
“…The PrivBasis approach proposed in [16] introduces a new mining technique based on the concept of basis set which allows to efficiently and effectively construct the set of frequent itemsets from a small set of short patterns. Recently, Zeng et al [25] pointed out the hardness of the differentially private frequent itemsets mining problem by investigating the trade-off between privacy and utility. As a negative result the authors showed that in order to achieve certain level of utility and privacy guarantee we could incur an extremely high privacy cost due to the presence of long transactions that increase the sensitivity of the count query for the itemsets.…”
Section: Existing Work For Frequent Itemset Mining (mentioning)
confidence: 99%
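
The link between long transactions and count-query sensitivity noted in the statement above can be illustrated with a small sketch. It is not the algorithm from the cited paper: the function names, the naive "keep the first items in sorted order" truncation rule, and the explicit item universe are all assumptions made for illustration. The idea is that one transaction with m items contributes to C(m, k) different k-itemset counts, so capping transaction length at `max_len` caps the L1 sensitivity at C(max_len, k), which in turn sets the Laplace noise scale.

```python
import random
from itertools import combinations
from math import comb


def truncate(transactions, max_len, universe):
    """Keep at most max_len items per transaction.

    Assumption: a naive rule (first items in sorted order), used only
    to illustrate the effect of bounding transaction length.
    """
    return [sorted(set(t) & set(universe))[:max_len] for t in transactions]


def l1_sensitivity(max_len, k):
    """A transaction with at most max_len items affects at most
    C(max_len, k) of the k-itemset counts, each by 1."""
    return comb(max_len, k)


def noisy_k_itemset_counts(transactions, universe, k, max_len, epsilon, rng=None):
    """Laplace-perturbed counts for every k-itemset over the universe.

    Noise is added to all candidate itemsets, including those with a
    true count of zero, so the set of released keys does not depend
    on the data.
    """
    rng = rng or random.Random()
    scale = l1_sensitivity(max_len, k) / epsilon

    true_counts = {}
    for t in truncate(transactions, max_len, universe):
        for itemset in combinations(t, k):
            true_counts[itemset] = true_counts.get(itemset, 0) + 1

    noisy = {}
    for itemset in combinations(sorted(universe), k):
        # The difference of two exponentials with mean `scale`
        # is Laplace-distributed with scale `scale`.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        noisy[itemset] = true_counts.get(itemset, 0) + noise
    return noisy
```

With `k = 2`, an untruncated transaction of 10 items gives a sensitivity of `comb(10, 2) = 45`, while truncating to 3 items gives `comb(3, 2) = 3`, so the noise scale drops by a factor of 15 at the cost of discarding items from long transactions. Enumerating every candidate itemset is only practical for small universes.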
“…In this setting, only few works have been proposed to mine frequent patterns [3,16,25]. Although these techniques have been shown effective in certain scenarios, they have the following limitations.…”
Section: Introductionmentioning
confidence: 99%