Min Lyu scite author profile

The Sparse Vector Technique (SVT) is a fundamental technique for satisfying differential privacy and has the unique quality that one can output some query answers without apparently paying any privacy cost. SVT has been used in both the interactive setting, where one tries to answer a sequence of queries that are not known ahead of time, and in the non-interactive setting, where all queries are known. Because of the potential savings on privacy budget, many variants of SVT have been proposed and employed in privacy-preserving data mining and publishing. However, most variants of SVT are actually not private. In this paper, we analyze these errors and identify the misunderstandings that likely contribute to them. We also propose a new version of SVT that provides better utility, and introduce an effective technique to improve the performance of SVT. These enhancements can be applied to improve utility in the interactive setting. Through both analytical and experimental comparisons, we show that, in the non-interactive setting (but not the interactive setting), the SVT technique is unnecessary, as it can be replaced by the Exponential Mechanism (EM) with better accuracy.

Publishing Graph Degree Distribution with Node Differential Privacy

Day

Lyu

2016

131

109

Graph data publishing under node-differential privacy (node-DP) is challenging due to the huge sensitivity of queries. However, since a node in graph data oftentimes represents a person, node-DP is necessary to achieve personal data protection. In this paper, we investigate the problem of publishing the degree distribution of a graph under node-DP by exploring the projection approach to reduce the sensitivity. We propose two approaches, based on aggregation and on cumulative histograms, to publish the degree distribution. The experiments demonstrate that our approaches greatly reduce the error of approximating the true degree distribution and significantly improve on existing work. We also present an introspective analysis for understanding the factors that affect publishing the degree distribution with node-DP.

Differential Privacy: From Theory to Practice

Lyu

et al. 2016

Synthesis Lectures on Information Security, Privacy, and Trust

105

Differentially Private K-Means Clustering and a Hybrid Approach to Private Optimization

Cao

et al. 2017

ACM Trans. Priv. Secur.

k-means clustering is a widely used clustering analysis technique in machine learning. In this article, we study the problem of differentially private k-means clustering. Several state-of-the-art methods follow the single-workload approach, which adapts an existing machine-learning algorithm by making each step private. However, most of them do not have satisfactory empirical performance. In this work, we develop techniques to analyze the empirical error behaviors of one of the state-of-the-art single-workload approaches, DPLloyd, which is a differentially private version of the Lloyd algorithm for k-means clustering. Based on the analysis, we propose an improvement of DPLloyd. We also propose a new algorithm for k-means clustering from the perspective of the noninteractive approach, which publishes a synopsis of the input dataset and then runs k-means on synthetic data generated from the synopsis. We denote this approach by EUGkM. After analyzing the empirical error behaviors of EUGkM, we further propose a hybrid approach that combines our DPLloyd improvement and EUGkM. Results from extensive and systematic experiments support our analysis and demonstrate the effectiveness of the DPLloyd improvement, EUGkM, and the hybrid approach.

Understanding the Sparse Vector Technique for Differential Privacy

Lyu

2016

Preprint

PrivPfC: differentially private data publication for classification

Cao

et al. 2018

The VLDB Journal

A Data Layout and Fast Failure Recovery Scheme for Distributed Storage Systems with Mixed Erasure Codes

Xu¹,

Lyu²,

Li³

et al. 2021

IEEE Trans. Comput.

HCFTL: A Locality-Aware Page-Level Flash Translation Layer

Chen

Lyu

et al. 2019
