Abstract:k-Anonymity by microaggregation is one of the most commonly used anonymization techniques. This success is owe to the achievement of a worth of interest tradeoff between information loss and identity disclosure risk. However, this method may have some drawbacks. On the disclosure limitation side, there is a lack of protection against attribute disclosure. On the data utility side, dealing with a real datasets is a challenging task to achieve. Indeed, the latter are characterized by their large number of attrib… Show more
“…Shi et al suggested to use distance metrics and information entropy to aggregate data into equivalent groups, thereby, ensuring the protection of individual privacy while minimizing information loss 19 . B. Abidi et al introduced a microaggregation method based on fuzzy possibilistic clustering 21 which proposes to study the distribution of confidential attributes within each sub-dataset and the privacy parameter K is determined by preserving the diversity of confidential attributes within the anonymized microdata. The microaggregation method proposed by Rodríguez-Hoyos et al employed linear discriminant analysis to build microcells 22 .…”
The rapid development of the mobile Internet coupled with the widespread use of intelligent terminals have intensified the digitization of personal information and accelerated the evolution of the era of big data. The sharing and publishing of various big data brings convenience and also increases the risk of personal privacy leakage. In order to reduce users’ privacy leakage that may be caused by data release, many privacy preserving data publishing methods have been proposed by scientists in both academia and industry in the recent years. However, non-numerical sensitive information has natural semantic relevance, and therefore, synonymous linkages may still exist and cause serious privacy disclosures in privacy protection methods based on an anonymous model. To address this issue, this paper proposes a privacy preserving dynamic data publishing method based on microaggregation. A series of indicators are accordingly designed to evaluate the synonymous linkages between the non-numerical sensitive values which in turn facilitate in improving the clustering effect of the microaggregation anonymous method. The dynamic update program is introduced into the proposed microaggregation method to realize the dynamic release and update of data. Experimental analysis suggests that the proposed method provides better privacy protection effect and availability of published data in contrast to the state-of-the-art methods.
“…Shi et al suggested to use distance metrics and information entropy to aggregate data into equivalent groups, thereby, ensuring the protection of individual privacy while minimizing information loss 19 . B. Abidi et al introduced a microaggregation method based on fuzzy possibilistic clustering 21 which proposes to study the distribution of confidential attributes within each sub-dataset and the privacy parameter K is determined by preserving the diversity of confidential attributes within the anonymized microdata. The microaggregation method proposed by Rodríguez-Hoyos et al employed linear discriminant analysis to build microcells 22 .…”
The rapid development of the mobile Internet coupled with the widespread use of intelligent terminals have intensified the digitization of personal information and accelerated the evolution of the era of big data. The sharing and publishing of various big data brings convenience and also increases the risk of personal privacy leakage. In order to reduce users’ privacy leakage that may be caused by data release, many privacy preserving data publishing methods have been proposed by scientists in both academia and industry in the recent years. However, non-numerical sensitive information has natural semantic relevance, and therefore, synonymous linkages may still exist and cause serious privacy disclosures in privacy protection methods based on an anonymous model. To address this issue, this paper proposes a privacy preserving dynamic data publishing method based on microaggregation. A series of indicators are accordingly designed to evaluate the synonymous linkages between the non-numerical sensitive values which in turn facilitate in improving the clustering effect of the microaggregation anonymous method. The dynamic update program is introduced into the proposed microaggregation method to realize the dynamic release and update of data. Experimental analysis suggests that the proposed method provides better privacy protection effect and availability of published data in contrast to the state-of-the-art methods.
“…Distance metrics and information entropy were used to aggregate data into equivalence groups, which ensure the protection of individual privacy while minimizing information loss. Abidi et al [25] introduced a new microaggregation method based on fuzzy possibilistic clustering, which proposes to study the distribution of confidential attributes within each sub-dataset and the privacy parameter K is determined by preserving the diversity of confidential attributes within the anonymized microdata. Ana et al [26] proposed a k-anonymous microaggregation method via linear discriminant analysis.…”
The rapid development of the mobile Internet coupled with the widespread use of intelligent terminals have intensifified the digitization of personal information and accelerated the evolution of the era of big data. The sharing and publishing of various big data brings convenience and also increases the risk of personal privacy leakage. In order to reduce users’ privacy leakage that may be caused by data release, many privacy preserving data publishing methods have been proposed by scientists in both academic and industry in the recent years. However, non-numerical sensitive information has natural semantic relevance,and therefore, synonymous linkages may still exist and cause serious privacy disclosures in privacy protection methods based on an anonymous model. To address this issue, this paper proposes a privacy preserving dynamic data publishing method based on micro aggregation. A series of indicators are accordingly designed to evaluate the synonymous linkages between the non-numerical sensitive values which in turn facilitate in improving the clustering effect of the micro-aggregation anonymous method. The dynamic update program is introduced into the proposed micro-aggregation method to realize the dynamic release and update of data. Experimental analysis suggests that the proposed method provides better privacy protection effect and availability of published data in contrast to the state-of-the-art methods.
“…Micro-aggregation can be applied to both continuous and categorical data without the need for the data author to create generalised categories. Various approaches to perform micro-aggregation were proposed; for example, a hybrid micro-aggregation approach which is "based on fuzzy possibilistic clustering" [24].…”
Section: Generic Privacy-preserving Data Transformation Approachesmentioning
Process mining has been successfully applied in the healthcare domain and has helped to uncover various insights for improving healthcare processes. While the benefits of process mining are widely acknowledged, many people rightfully have concerns about irresponsible uses of personal data. Healthcare information systems contain highly sensitive information and healthcare regulations often require protection of data privacy. The need to comply with strict privacy requirements may result in a decreased data utility for analysis. Until recently, data privacy issues did not get much attention in the process mining community; however, several privacy-preserving data transformation techniques have been proposed in the data mining community. Many similarities between data mining and process mining exist, but there are key differences that make privacy-preserving data mining techniques unsuitable to anonymise process data (without adaptations). In this article, we analyse data privacy and utility requirements for healthcare process data and assess the suitability of privacy-preserving data transformation methods to anonymise healthcare data. We demonstrate how some of these anonymisation methods affect various process mining results using three publicly available healthcare event logs. We describe a framework for privacy-preserving process mining that can support healthcare process mining analyses. We also advocate the recording of privacy metadata to capture information about privacy-preserving transformations performed on an event log.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.