2023
DOI: 10.1109/tbdata.2022.3207521

Scalable Distributed Data Anonymization for Large Datasets

Abstract: k-Anonymity and ℓ-diversity are two well-known privacy metrics that guarantee protection of the respondents of a dataset by obfuscating information that can disclose their identities and sensitive information. Existing solutions for enforcing them implicitly assume to operate in a centralized scenario, since they require complete visibility over the dataset to be anonymized, and can therefore have limited applicability in anonymizing large datasets. In this paper, we propose a solution that extends Mondrian (an…
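The two metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions (every combination of quasi-identifier values is shared by at least k records; every such group contains at least ℓ distinct sensitive values) and is not the paper's distributed algorithm. The function names, column names, and toy records below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already-generalized toy table (assumed for illustration only).
rows = [
    {"age": "30-39", "zip": "481**", "disease": "flu"},
    {"age": "30-39", "zip": "481**", "disease": "asthma"},
    {"age": "40-49", "zip": "482**", "disease": "flu"},
    {"age": "40-49", "zip": "482**", "disease": "flu"},
]

print(is_k_anonymous(rows, ["age", "zip"], k=2))
print(is_l_diverse(rows, ["age", "zip"], "disease", l=2))
```

In this toy table each group has two records, so the 2-anonymity check passes, but the second group contains a single disease value, so the 2-diversity check fails. Both checks need every record of a group in one place, which reflects the complete-visibility assumption the abstract attributes to centralized approaches.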

Cited by 1 publication (5 citation statements)
References 36 publications
“…The H-PGPkAA algorithm showed good acceleration with slightly equal information loss as that of the GkAA algorithm. Additionally, we believe that the data reorganization used in these algorithms can be applied to improve data utility while achieving k-anonymity in other distributed parallel algorithms employing horizontal data partitioning, including the GCCG algorithm [10] or the extended Mondrian algorithm [31].…”
Section: Discussion (mentioning)
confidence: 99%
“…There has been extensive research conducted on distributed privacy preserving data publishing over the last twenty years [27]- [31]. Within this domain, the primary challenges revolve around data partitioning and aggregation strategies.…”
Section: A Related Work (mentioning)
confidence: 99%