The anonymization of health data streams is important to protect these data against potential privacy breaches. A large number of research studies aiming at offering privacy in the context of data streams has been recently conducted. However, the techniques that have been proposed in these studies generate a significant delay during the anonymization process, since they concentrate on applying existing privacy models (e.g., k-anonymity and l-diversity) to batches of data extracted from data streams in a period of time. In this paper, we present delay-free anonymization, a framework for preserving the privacy of electronic health data streams. Unlike existing works, our method does not generate an accumulation delay, since input streams are anonymized immediately with counterfeit values. We further devise late validation for increasing the data utility of the anonymization results and managing the counterfeit values. Through experiments, we show the efficiency and effectiveness of the proposed method for the real-time release of data streams.
RDF is a data model for representing labeled directed graphs, and it is used as an important building block of semantic web. Due to its flexibility and applicability, RDF has been used in applications, such as semantic web, bioinformatics, and social networks. In these applications, large-scale graph datasets are very common. However, existing techniques are not effectively managing them. In this paper, we present a scalable, efficient query processing system for RDF data, named SPIDER, based on the well-known parallel/distributed computing framework, Hadoop. SPIDER consists of two major modules (1) the graph data loader, (2) the graph query processor. The loader analyzes and dissects the RDF data and places parts of data over multiple servers. The query processor parses the user query and distributes sub queries to cluster nodes. Also, the results of sub queries from multiple servers are gathered (and refined if necessary) and delivered to the user. Both modules utilize the MapReduce framework of Hadoop. In addition, our system supports some features of SPARQL query language. This prototype will be foundation to develop real applications with large-scale RDF graph data.
SUMMARYRecently, social network services are rapidly growing and this trend is expected to continue in the future. Social network data can be published for various purposes such as statistical analysis and population studies. When social network data are published, however, the privacy of some people may be disclosed. The most straightforward manner to preserve privacy in social network data is to remove the identifiers of persons from the social network data. However, an adversary can infer the identity of a person in the social network by using his/her background knowledge, which consists of content information such as the age, sex, or address of the person and structural information such as the number of persons having a relationship with the person. In this paper, we propose a privacy protection method for social network data. The proposed method anonymizes social network data to prevent privacy attacks that use both content and structural information, while minimizing the information loss or distortion of the anonymized social network data. Through extensive experiments, we verify the effectiveness and applicability of the proposed method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.