Wendy Hui Wang scite author profile

Machine learning is powerful to model massive genomic data while genome privacy is a growing concern. Studies have shown that not only the raw data but also the trained model can potentially infringe genome privacy. An example is the membership inference attack (MIA), by which the adversary, who only queries a given target model without knowing its internal parameters, can determine whether a specific record was included in the training dataset of the target model. Differential privacy (DP) has been used to defend against MIA with rigorous privacy guarantee. In this paper, we investigate the vulnerability of machine learning against MIA on genomic data, and evaluate the effectiveness of using DP as a defense mechanism. We consider two widely-used machine learning models, namely Lasso and convolutional neural network (CNN), as the target model. We study the trade-off between the defense power against MIA and the prediction accuracy of the target model under various privacy settings of DP. Our results show that the relationship between the privacy budget and target model accuracy can be modeled as a log-like curve, thus smaller privacy budget provides stronger privacy guarantee with the cost of losing more model accuracy. We also investigate the effect of model sparsity on model vulnerability against MIA. Our results demonstrate that in addition to prevent overfitting, model sparsity can work together with DP to significantly mitigate the risk of MIA.

show abstract

PAR-GAN: Improving the Generalization of Generative Adversarial Networks Against Membership Inference Attacks

Chen

Wang

Gao

et al. 2021

View full text Add to dashboard Cite

Answering XML queries using materialized views revisited

Theodoratos

Wang

2009

View full text Add to dashboard Cite

Answering queries using views is a well-established technique in databases. In this context, two outstanding problems can be formulated. The first one consists in deciding whether a query can be answered exclusively using one or multiple materialized views. Given the many alternative ways to compute the query from the materialized views, the second problem consists in finding the best way to compute the query from the materialized views. In the realm of XML, there is a restricted number of contributions in the direction of these problems due to the many limitations associated with the use of materialized views in traditional XML query evaluation models. In this paper, we adopt a recent evaluation model, called inverted lists model, and holistic algorithms which together have been established as the prominent technique for evaluating queries on large persistent XML data, and we address the previous two problems. This new context revises these problems since it requires new conditions for view usability and new techniques for computing queries from materialized views. We suggest an original approach for materializing views which stores for every view node only the list of XML nodes necessary for computing the answer of the view. We specify necessary and sufficient conditions for answering a treepattern query using one or multiple materialized views in terms of homomorphisms from the views to the query. In order to efficiently answer queries using materialized views, we design a stack-based algorithm which compactly encodes in polynomial time and space all the homomorphisms from a view to a query. We further propose space and time optimizations by using bitmaps to encode view materializations and by employing bitwise operations to minimize the evaluation cost of the queries. Finally, we conducted an extensive experimentation which demonstrates that our approach yields impressive query hit rates in the view pool, achieves significant time and space savings and shows smooth scalability.

show abstract

Integrity Verification of Outsourced Frequent Itemset Mining with Deterministic Guarantee

Dong

Liu

Wang

2013

View full text Add to dashboard Cite

Automating Fairness Configurations for Machine Learning

Sun

Yang

et al. 2021

View full text Add to dashboard Cite

12 3 4 5

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Wendy Hui Wang

Algorithmic fairness in computational medicine

Privacy-Preserving Distributed Movement Data Aggregation

Towards Fair Truth Discovery from Biased Crowdsourced Answers

Differential Privacy Protection Against Membership Inference Attack on Machine Learning for Genomic Data

PAR-GAN: Improving the Generalization of Generative Adversarial Networks Against Membership Inference Attacks

Answering XML queries using materialized views revisited

Integrity Verification of Outsourced Frequent Itemset Mining with Deterministic Guarantee

Automating Fairness Configurations for Machine Learning

Contact Info

Product

Resources

About