Classification is one of the most common data mining problems in practice, and decision tree classification is one of the best-known approaches to it. This paper describes how to construct a decision tree classifier on vertically partitioned data held by different owners while concealing each party's data from the others. Our protocol uses an efficient splitting strategy and a semi-trusted third party to build a binary decision tree model. The third party acts as a commodity server: the data owners send requests to it and receive commodities (data) that are independent of the parties involved in the classification. The commodity server thus assists the parties in carrying out the computation needed for decision tree construction. The security of our classification method rests on a scalar product protocol. The goal of secure protocols is to preserve privacy without requiring a third party that everyone trusts.
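The abstract does not say which scalar product protocol is used, so the following is only a minimal sketch of one well-known commodity-server construction, not the authors' protocol. The names `commodity_server`, `secure_scalar_product`, and `dot` are hypothetical, and plain bounded integers stand in for the finite-field arithmetic a real deployment would need for proper masking. The server hands out correlated random values that do not depend on the private inputs, and the two parties end up with additive shares `v1` and `v2` of the dot product.

```python
import random

def dot(u, v):
    """Ordinary dot product of two equal-length integer vectors."""
    return sum(a * b for a, b in zip(u, v))

def commodity_server(n, bound=10**6, rng=random):
    """Hypothetical commodity server: returns correlated randomness that is
    independent of the parties' private vectors (ra + rb == Ra . Rb)."""
    Ra = [rng.randrange(bound) for _ in range(n)]
    Rb = [rng.randrange(bound) for _ in range(n)]
    ra = rng.randrange(bound)
    rb = dot(Ra, Rb) - ra
    return (Ra, ra), (Rb, rb)

def secure_scalar_product(x, y, rng=random):
    """Alice holds x, Bob holds y. Returns shares (v1, v2) with v1 + v2 == x . y.
    Both roles are simulated in one process for illustration only."""
    (Ra, ra), (Rb, rb) = commodity_server(len(x), rng=rng)
    x_masked = [a + r for a, r in zip(x, Ra)]   # Alice -> Bob
    y_masked = [b + r for b, r in zip(y, Rb)]   # Bob -> Alice
    v2 = rng.randrange(10**6)                   # Bob's random share
    t = dot(x_masked, y) + rb - v2              # Bob -> Alice
    v1 = t - dot(Ra, y_masked) + ra             # Alice's share
    return v1, v2

# Example: 0/1 vectors such as "record satisfies attribute test" (Alice)
# and "record belongs to class c" (Bob); the dot product is a joint count.
v1, v2 = secure_scalar_product([1, 0, 1, 1], [1, 1, 0, 1])
print(v1 + v2)  # 2, the true dot product
```

In a decision tree setting, counts of this kind are what a split criterion needs, which is presumably why a scalar product protocol is enough to drive tree construction on vertically partitioned data.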
Data is distributed across various sites and needs to be mined securely, without revealing anything except the results of mining. This paper discusses privacy-preserving classification techniques for horizontally distributed data, in which multiple sites collaborate and broadcast the mining results. In the process, no information is divulged about either the data maintained at the sites or the data obtained during computation. We present two protocols for constructing a privacy-preserving Naïve Bayesian classifier using Paillier's homomorphic encryption. We claim that our approach is more secure and efficient than previous privacy-preserving Naïve Bayesian methods.
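To illustrate the additive homomorphism that such protocols rely on, here is a toy sketch of textbook Paillier encryption with the common choice g = n + 1. It is not either of the two protocols the abstract mentions, whose details are not given here. The function names and the tiny primes are assumptions made for illustration, the parameters are far too small to be secure, and `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Probabilistic encryption: c = (n+1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

# Three sites encrypt their local class counts; anyone can combine the
# ciphertexts, but only the key holder can decrypt the total.
n, priv = keygen()
total = encrypt(n, 0)
for local_count in [12, 7, 30]:
    total = add_encrypted(n, total, encrypt(n, local_count))
print(decrypt(n, priv, total))  # 49
```

Because encryption is randomized, two encryptions of the same count look unrelated, so the ciphertexts alone do not reveal whether two sites hold the same value.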
To extract interesting patterns, a model has to be trained on data available at multiple sites. The data held at these sites should not be revealed while patterns are being extracted. Distributed data mining enables sites to mine patterns based on the knowledge available at different sites. When sites collaborate to develop a model, it is extremely important to protect the privacy of the data and of intermediate results. The features of the data maintained at each site are often similar in nature. In this paper, we design an improved privacy-preserving distributed naive Bayesian classifier trained on horizontally partitioned data. The trained model is propagated to the sites involved in the computation to help classify a new tuple. We further analyze the security and complexity of the algorithm.
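For horizontally partitioned data, a naive Bayes model depends only on counts summed over the sites, which is why protecting those sums is the core of the problem. The sketch below shows just that structure and deliberately omits the privacy layer: in a real protocol the merge step would run over encrypted or masked counts. All function names, the Laplace smoothing, and the `n_values` parameter are assumptions for illustration, not details from the paper.

```python
import math
from collections import Counter

def local_counts(rows):
    """One site's statistics. rows: list of (features_tuple, label)."""
    class_counts, feat_counts = Counter(), Counter()
    for features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            feat_counts[(i, value, label)] += 1
    return class_counts, feat_counts

def merge(site_counts):
    """Sum per-site counts; a secure protocol would do this without
    exposing any individual site's contribution."""
    total_class, total_feat = Counter(), Counter()
    for class_counts, feat_counts in site_counts:
        total_class.update(class_counts)
        total_feat.update(feat_counts)
    return total_class, total_feat

def classify(model, features, n_values=2):
    """Pick the label with the highest log-posterior.
    n_values: assumed number of distinct values per feature (for smoothing)."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(features):
            score += math.log((feat_counts[(i, value, label)] + 1) / (count + n_values))
        if score > best_score:
            best, best_score = label, score
    return best

site_a = [((1, 1), "yes"), ((1, 0), "yes")]
site_b = [((0, 0), "no"), ((0, 1), "no")]
model = merge([local_counts(site_a), local_counts(site_b)])
print(classify(model, (1, 1)))  # yes
```

The merged counts are identical to the counts computed on the pooled data, so each site can classify new tuples with the shared model exactly as if the data had been centralized.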
Data maintained in various sectors needs to be mined to derive useful inferences. A large part of this data is sensitive and must not be revealed during mining. Current methods perform privacy-preserving classification by randomizing, perturbing, or anonymizing the data during mining. These forms of privacy-preserving mining work well for data centralized at a single site. Moreover, the amount of information hidden during mining is not sufficient, and when perturbation approaches are used, data reconstruction is a major challenge. This paper aims at modeling classifiers for data distributed across various sites with respect to the same instances. The homomorphic and probabilistic properties of the Paillier cryptosystem are used to perform secure product, mean, and variance calculations. The secure computations are performed without revealing any intermediate data or the sensitive data held at the sites. The accuracy of the resulting classifiers is observed to be almost equivalent to that of their non-privacy-preserving counterparts. The secure protocols require reduced computation time and communication cost.
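The abstract does not spell out how the product, mean, and variance are obtained, so the following is only one plausible arrangement based on the standard Paillier identities. The symbols are assumptions: E and D denote encryption and decryption under public modulus n, x_i is an attribute value held by one site, y_i in {0, 1} marks membership of instance i in class c at another site, and N_c is the class size. The variance shown is the population variance.

```latex
% Standard additive and scalar homomorphisms of Paillier
\begin{align}
D\big(E(a)\cdot E(b) \bmod n^2\big) &= a + b \bmod n, \\
D\big(E(a)^{k} \bmod n^2\big) &= k\,a \bmod n, \\
D\Big(\prod_{i=1}^{N} E(x_i)^{y_i} \bmod n^2\Big) &= \sum_{i=1}^{N} x_i y_i \bmod n .
\end{align}
% Class-conditional statistics built from such secure products,
% with N_c = \sum_{i=1}^{N} y_i
\begin{align}
\mu_c &= \frac{1}{N_c}\sum_{i=1}^{N} x_i y_i, &
\sigma_c^2 &= \frac{1}{N_c}\sum_{i=1}^{N} x_i^2 y_i - \mu_c^2 .
\end{align}
```

Under this arrangement, the site holding the class labels only ever sees ciphertexts of the attribute values, and the site holding the attribute only learns the aggregated sums it decrypts.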