Recent advances in sensing and storage technologies have ushered in the big data age, in which huge amounts of data are distributed across sites to be stored and analysed. Cluster analysis is a data mining task that aims to discover patterns and knowledge through algorithmic techniques such as k-means. However, running k-means over distributed big data stores raises serious privacy issues. Many proposed works have attempted to address this concern using cryptographic protocols, but these solutions degrade the performance of analysis tasks, which is incompatible with big data requirements. In this work we propose a novel privacy-preserving k-means algorithm based on a simple yet secure and efficient multiparty additive scheme that is cryptography-free. We designed our solution for horizontally partitioned data. Moreover, we demonstrate that our scheme resists adversaries in the passive model.
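To make the idea of an additive multiparty scheme over horizontally partitioned data more concrete, the following is a minimal sketch of a generic additive-masking secure sum used for one k-means centroid update. It is not the scheme proposed in the abstract above, whose details are not given here. The names `MODULUS`, `make_shares`, `secure_sum` and `global_centroid` are hypothetical, and the sketch assumes non-negative integer inputs, one-dimensional points, and honest-but-curious (passive) parties.

```python
import random

# Assumption: a public modulus larger than any true sum (hypothetical choice).
MODULUS = 2**61 - 1


def make_shares(value, n_parties, rng):
    """Split a non-negative integer into n additive shares (mod MODULUS)."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values, rng=None):
    """Sum the parties' private values without revealing any single value.

    shares[i][j] is the random-looking share that party i sends to party j.
    Each party publishes only the total of the shares it received.
    """
    rng = rng or random.SystemRandom()
    n = len(private_values)
    shares = [make_shares(v, n, rng) for v in private_values]
    partials = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partials) % MODULUS


def global_centroid(local_sums, local_counts):
    """Centroid of one cluster from per-site coordinate sums and point counts."""
    total = secure_sum(local_sums)
    count = secure_sum(local_counts)
    return total / count


# Three sites hold the local sums and counts of the points assigned to a cluster.
centroid = global_centroid([10, 20, 30], [2, 3, 5])
```

Each site learns only the aggregated sum and count, which is all that a k-means centroid update needs when the rows of the dataset are split across sites.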
Human activity recognition (HAR) is an important research field that relies on sensing technologies to enable many context-aware applications. However, tracking personal signs to enable such applications raises serious privacy issues, especially when external activity recognition services are used. In this paper, we propose Π-Knn, a privacy-preserving version of the K Nearest Neighbors (k-NN) classifier that is built mainly on Π-CSP+, a novel cryptography-free private similarity evaluation protocol. As a sample application, we consider a medical monitoring system enhanced with a HAR process based on our privacy-preserving classifier. Integrating the privacy-preserving HAR aims to improve the accuracy of clinical decision support. We conduct a standard security analysis to prove that our protocols provide complete privacy protection against malicious adversaries. We perform a comparative performance evaluation through several experiments using real HAR system parameters. Experimental evaluations show that our protocol Π-CSP+ incurs a low overhead increase (37% in online classification and 50% in offline classification) compared to PCSC, a representative state-of-the-art protocol, which incurs 3600% and 4800% in online and offline classification respectively. In addition, Π-CSP+ provides a stable and efficient response time (W = 0.0x m.seconds) for both short- and long-duration activities while serving up to 1000 clients. Comparative results confirm the computational efficiency of our protocol against a competitive state-of-the-art protocol.
Big data systems are gathering more and more information in order to discover new value through data analytics and in-depth insights. However, mining sensitive personal information breaches privacy and damages the reputation of services. Accordingly, many research works have been proposed to address the privacy issues of data analytics, but most seem unsuitable for the big data context, either in the data types they support or in computation time efficiency. In this paper we propose a novel privacy-preserving cosine similarity computation protocol that supports both binary and numerical data types within an efficient computation time, and we prove its adequacy for the high volume, high variety and high velocity of big data.
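For reference, the quantity that such a protocol computes privately is the ordinary cosine similarity, cos(a, b) = (a · b) / (‖a‖ ‖b‖). The sketch below shows only the plaintext computation, with no privacy protection, to illustrate that the same formula covers binary and numerical vectors. The function name `cosine_similarity` and the convention of returning 0.0 for an all-zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Plaintext cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat similarity with an all-zero vector as 0.0.
        return 0.0
    return dot / (norm_a * norm_b)


# Binary vectors (e.g. item presence flags).
binary_score = cosine_similarity([1, 0, 1, 1], [1, 1, 0, 1])

# Numerical vectors (e.g. sensor readings).
numeric_score = cosine_similarity([3.0, 4.0], [4.0, 3.0])
```

For binary vectors the dot product counts the positions set in both vectors, so the score reduces to the shared count divided by the geometric mean of the two set sizes.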