2019 IEEE International Symposium on Information Theory (ISIT) 2019
DOI: 10.1109/isit.2019.8849392

A Concentration of Measure Approach to Database De-anonymization

Abstract: In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of measure theorems such as typicality and laws of large numbers are used to develop a database matching scheme and derive necessary conditions for successful matching. Furthermore, it is shown that these conditions are tight through a converse result which characterizes a set of …
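
The typicality-based matching idea in the abstract can be illustrated with a small sketch. This is not the paper's scheme: it assumes i.i.d. binary entries, a known joint distribution `p`, a hand-picked tolerance `eps`, and a bit-flip noise model, and every function name here is hypothetical. A row of the second database is accepted as the match for a row of the first only if it is the unique row whose empirical joint type lies within `eps` of `p`.

```python
import random
from collections import Counter


def joint_type(x, y):
    """Empirical joint distribution (type) of two equal-length sequences."""
    counts = Counter(zip(x, y))
    d = len(x)
    return {pair: c / d for pair, c in counts.items()}


def is_jointly_typical(x, y, p, eps):
    """True if every entry of the empirical joint type is within eps of p."""
    t = joint_type(x, y)
    support = set(p) | set(t)
    return all(abs(t.get(s, 0.0) - p.get(s, 0.0)) <= eps for s in support)


def match_rows(db1, db2, p, eps):
    """For each row of db1, return the index of the unique jointly typical
    row of db2, or None if zero or several rows qualify."""
    result = []
    for x in db1:
        candidates = [j for j, y in enumerate(db2)
                      if is_jointly_typical(x, y, p, eps)]
        result.append(candidates[0] if len(candidates) == 1 else None)
    return result


def make_databases(n, d, flip, rng):
    """Assumed toy model: uniform binary rows, each bit flipped independently
    with probability `flip`, then the rows of the second database shuffled."""
    db1 = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n)]
    noisy = [[b ^ (rng.random() < flip) for b in row] for row in db1]
    perm = list(range(n))
    rng.shuffle(perm)
    db2 = [noisy[i] for i in perm]  # db2[j] is the noisy copy of db1[perm[j]]
    truth = [perm.index(i) for i in range(n)]
    return db1, db2, truth


if __name__ == "__main__":
    rng = random.Random(0)
    flip = 0.1
    # Joint distribution of (X, Y) for a uniform bit passed through the flip noise.
    p = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
    db1, db2, truth = make_databases(n=8, d=1000, flip=flip, rng=rng)
    print(match_rows(db1, db2, p, eps=0.1) == truth)
```

With many columns, the joint type of a true pair concentrates around `p`, while the joint type of an unrelated pair concentrates around the product of the marginals, so a fixed tolerance separates the two cases with high probability in this toy setting.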

Cited by 21 publications (32 citation statements)
References 17 publications
“…statistics, we can show users have no privacy iff $m = \Omega(n^{\frac{2}{r-1}+\alpha})$ and $a_n = O(n^{-\frac{1}{r-1}-\beta})$; however, if the data trace of users is governed by a Markov chain, we can show users have no privacy iff $m = \Omega(n^{\frac{2}{|E|-r}+\alpha})$ and $a_n = O(n^{-\frac{1}{|E|-r}-\beta})$. Most of the previous work [19]-[25] that considers intra-user dependency assumes independence between the traces of different users, which is different from our work as described below.…”
Section: Also Define
Citation type: mentioning
confidence: 83%
“…The bulk of previous work assumes independence between the traces of different users. [19]-[25] have mostly considered temporal and spatial dependency within data traces, but not cross-user dependency. In [19], an obfuscation technique is employed to achieve privacy; however, for continuous Location-Based Services (LBS) queries, there is often strong temporal dependency in the locations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…More recently, matching of correlated databases has been rigorously investigated in [6] and [7]. In [6], Shirani et al. developed a matching scheme based on joint typicality and derived necessary and sufficient conditions on the database growth rate for reliable matching using an extension of the Shannon-McMillan-Breiman Theorem and Fano's inequality. In [7], Cullina et…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…We model the above example as a database matching problem where the goal is to match the corresponding rows across databases such that the probability of mismatch goes to zero as the number of attributes in the database (number of columns) grows to infinity. The two databases are assumed to have the same number of users (rows) and are generated according to a bivariate stochastic process as in [6]. Unlike in [6], the second database suffers from column deletion.…”
Section: Al Introduced Cycle Mutual Information As A
Citation type: mentioning
confidence: 99%