Abstract. Secure multi-party computation platforms are becoming more and more practical. This has paved the way for privacy-preserving statistical analysis using secure multi-party computation. Simple statistical analysis functions have been emerging here and there in literature, but no comprehensive system has been compiled. We describe and implement the most used statistical analysis functions in the privacy-preserving setting including simple statistics, t-test, χ 2 test, Wilcoxon tests and linear regression. We give descriptions of the privacy-preserving algorithms and benchmark results that show the feasibility of our solution.
We describe the use of secure multi-party computation for performing a large-scale privacypreserving statistical study on real government data. In 2015, statisticians from the Estonian Center of Applied Research (CentAR) conducted a big data study to look for correlations between working during university studies and failing to graduate in time. The study was conducted by linking the database of individual tax payments from the Estonian Tax and Customs Board and the database of higher education events from the Ministry of Education and Research. Data collection, preparation and analysis were conducted using the Sharemind secure multi-party computation system that provided end-to-end cryptographic protection to the analysis. Using ten million tax records and half a million education records in the analysis, this is the largest cryptographically private statistical study ever conducted on real data.
We improve the quality of cryptographically privacy-preserving genome-wide association studies by correctly handling population stratification-the inherent genetic difference of patient groups, e.g., people with different ancestries. Our approach is to use principal component analysis to reduce the dimensionality of the problem so that we get less spurious correlations between traits of interest and certain positions in the genome. While this approach is commonplace in practical genomic analysis, it has not been used within a privacy-preserving setting. In this paper, we use cryptographically secure multi-party computation to tackle principal component analysis, and present an implementation and experimental results showing the performance of the approach.
We present ZK S C, a domain-specific language for zero-knowledge proofs. We present the rationale for its design, its syntax and semantics, and demonstrate its usefulness on the basis of a number of non-trivial examples. The design features a type system, where each piece of data is assigned both a confidentiality and an integrity type, which are not orthogonal to each other. We perform an empiric evaluation of the statements produced by its compiler in terms of their size. We also show the integration of the compiler with the implementation of a zero-knowledge proof technique, and evaluate the running time of both Prover and Verifier.
In bioinformatics, genome-wide association studies (GWAS) are used to detect associations between single-nucleotide polymorphisms (SNPs) and phenotypic traits such as diseases. Significant differences in SNP counts between case and control groups can signal association between variants and phenotypic traits. Most traits are affected by multiple genetic locations. To detect these subtle associations, bioinformaticians need access to more heterogeneous data. Regulatory restrictions in cross-border health data exchange have created a surge in research on privacy-preserving solutions, including secure computing techniques. However, in studies of such scale, one must account for population stratification, as under- and over-representation of sub-populations can lead to spurious associations. We improve on the state of the art of privacy-preserving GWAS methods by showing how to adapt principal component analysis (PCA) with stratification control (EIGENSTRAT), FastPCA, EMMAX and the genomic control algorithm for secure computing. We implement these methods using secure computing techniques—secure multi-party computation (MPC) and trusted execution environments (TEE). Our algorithms are the most complex ones at this scale implemented with MPC. We present performance benchmarks and a security and feasibility trade-off discussion for both techniques.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.