2022
DOI: 10.48550/arxiv.2207.01193
Preprint
A Customised Text Privatisation Mechanism with Differential Privacy

Abstract: In Natural Language Understanding (NLU) applications, training an effective model often requires a massive amount of data. However, text data in the real world are scattered in different institutions or user devices. Directly sharing them with the NLU service provider brings huge privacy risks, as text data often contains sensitive information, leading to potential privacy leakage. A typical way to protect privacy is to directly privatize raw text and leverage Differential Privacy (DP) to quantify the privacy …
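The privatization approach the abstract describes — perturbing raw text under a DP guarantee before it leaves the user's device — is commonly instantiated at the word level: add calibrated noise to a word's embedding, then snap the noisy vector back to the nearest vocabulary word. The sketch below illustrates that general idea; the toy vocabulary, 2-D embeddings, and per-coordinate Laplace noise are simplifying assumptions for illustration, not the paper's actual mechanism.

```python
import numpy as np

# Hypothetical toy vocabulary with 2-D embeddings (illustration only).
VOCAB = {
    "bank":  np.array([1.0, 0.0]),
    "river": np.array([0.9, 0.2]),
    "money": np.array([0.1, 1.0]),
    "cash":  np.array([0.0, 0.9]),
}

def privatize_word(word, epsilon, rng):
    """Perturb the word's embedding with Laplace noise (scale 1/epsilon),
    then snap to the nearest vocabulary embedding. The snapping step is
    post-processing, so it does not weaken the privacy guarantee."""
    noisy = VOCAB[word] + rng.laplace(scale=1.0 / epsilon, size=2)
    return min(VOCAB, key=lambda w: np.linalg.norm(VOCAB[w] - noisy))

rng = np.random.default_rng(0)
# Lower epsilon means more noise, so the output word more often
# differs from the input, hiding which word the user actually wrote.
sanitized = [privatize_word("bank", epsilon=0.5, rng=rng) for _ in range(5)]
print(sanitized)
```

With a large epsilon the noise is negligible and the input word is returned almost surely; as epsilon shrinks, nearby words such as "river" are emitted instead, which is the utility/privacy trade-off DP quantifies.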

Cited by 1 publication (1 citation statement)
References 28 publications (60 reference statements)
“…They sanitize public data before training the model, as they enable the model to work with sanitized queries more effectively, thus enhancing accuracy. Additionally, recent studies have explored novel techniques for safeguarding text data privacy by manipulating the data during the collection process [22][23][24].…”
Section: Privacy-Preserving Text Analysis and Collection
Confidence: 99%