In privacy-preserving data mining, the l-diversity and k-anonymity models are the most widely used for protecting an individual's sensitive private information. Of the two, the l-diversity model gives better privacy and lower information loss than the k-anonymity model. Numerous clustering algorithms have also been proposed in data mining, namely k-means, particle swarm optimization (PSO), ant colony optimization (ACO), and bacterial foraging optimization (BFO). Among them, the BFO algorithm is more stable and faster than all others except k-means. However, the BFO algorithm suffers from poor convergence behavior compared with other optimization algorithms. We also observed that the current literature lacks any approach that applies BFO with the l-diversity model to realize privacy preservation in data mining. Motivated by this observation, we propose an approach that uses fractional calculus (FC) in the chemotaxis step of the BFO algorithm, where FC is used to boost the computational performance of the algorithm. We evaluate the proposed FC-BFO algorithm and the original BFO algorithm empirically, using information loss and execution time as the key metrics. The experimental evaluation shows that the proposed FC-BFO algorithm yields a more nearly optimal clustering than the original BFO algorithm and existing clustering algorithms.
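To make the difference between the two privacy models concrete, the following is a minimal sketch of the standard definitions: a table is k-anonymous if every group of records sharing the same quasi-identifier values has at least k members, and (distinct) l-diverse if every such group also contains at least l different sensitive values. The function names, attribute names, and sample records are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records into equivalence classes sharing the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        classes[key].append(record)
    return list(classes.values())


def is_k_anonymous(classes, k):
    """Every equivalence class must contain at least k records."""
    return all(len(c) >= k for c in classes)


def is_l_diverse(classes, sensitive, l):
    """Every equivalence class must contain at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in c}) >= l for c in classes)


# Hypothetical, already-generalized records (assumed data for illustration only).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]
classes = group_by_quasi_identifiers(records, ["age", "zip"])
two_anonymous = is_k_anonymous(classes, 2)
two_diverse = is_l_diverse(classes, "disease", 2)
```

In this sample the table is 2-anonymous but not 2-diverse, because everyone in the first group has the same disease. That gap is the kind of disclosure l-diversity is meant to close.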
In medical organizations, large amounts of personal data are collected and analyzed by data miners and researchers. The collected data may contain sensitive information, such as a patient's specific disease, that should be kept confidential. Hence, any analysis of such data must include checks that protect against threats to individual privacy. In this context, data mining research now places greater emphasis on privacy preservation algorithms. One such approach is anonymization, which can protect private information but may also lose valuable information. The main challenge is therefore to minimize the information loss incurred during the anonymization process. The proposed method groups similar records together based on the sensitive attribute and then anonymizes them. Our experimental results show that the proposed method offers better outcomes with respect to information loss and execution time.
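As a rough illustration of the trade-off described above, the sketch below generalizes a numeric attribute within pre-formed groups and measures how much detail is lost. It is not the method from the paper: the grouping is assumed to be given, and the loss measure (each group's value range divided by the attribute's overall range, averaged over records) is one common choice that the abstract does not specify. All names and data are hypothetical.

```python
def generalize_groups(groups, attribute):
    """Replace each record's numeric attribute with its group's (min, max) range."""
    anonymized = []
    for group in groups:
        low = min(r[attribute] for r in group)
        high = max(r[attribute] for r in group)
        anonymized.append([{**r, attribute: (low, high)} for r in group])
    return anonymized


def information_loss(groups, attribute):
    """Average normalized range width per record: 0 means no loss, 1 means full generalization."""
    values = [r[attribute] for g in groups for r in g]
    span = max(values) - min(values)
    if span == 0:
        return 0.0
    total = 0.0
    for g in groups:
        width = max(r[attribute] for r in g) - min(r[attribute] for r in g)
        total += len(g) * width / span
    return total / len(values)


# Hypothetical groups (assumed data); each group mixes sensitive values.
groups = [
    [{"age": 31, "disease": "flu"}, {"age": 35, "disease": "cancer"}],
    [{"age": 47, "disease": "flu"}, {"age": 51, "disease": "asthma"}],
]
anonymized = generalize_groups(groups, "age")
loss = information_loss(groups, "age")
```

Grouping records whose quasi-identifier values are close keeps each range narrow, so the same privacy guarantee is reached with less information loss than when dissimilar records are grouped together.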