Open data plays a fundamental role in the 21th century by stimulating economic growth and by enabling more transparent and inclusive societies. However, it is always difficult to create new high-quality datasets with the required privacy guarantees for many use cases. This paper aims at creating a framework for releasing new open data while protecting the individuality of the users through a strict definition of privacy called differential privacy. Unlike previous work, this paper provides a framework for privacy preserving data publishing that can be easily adapted to different use cases, from the generation of time-series to continuous data, and discrete data; no previous work has focused on the later class. Indeed, many use cases expose discrete data or at least a combination between categorical and numerical values. Thanks to the latest developments in deep learning and generative models, it is now possible to model rich-semantic data maintaining both the original distribution of the features and the correlations between them. The output of this framework is a deep network, namely a generator, able to create new data on demand. We demonstrate the efficiency of our approach on real datasets from the French public administration and classic benchmark datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.