Handling huge amounts of data scalably has long been a matter of concern, and the same is true for Semantic Web data. Current Semantic Web frameworks lack this ability. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe our schema for storing RDF data in the Hadoop Distributed File System. We also present our algorithms for answering SPARQL queries, which use Hadoop's MapReduce framework. Our results reveal that we can store huge amounts of Semantic Web data in Hadoop clusters built mostly from cheap commodity-class hardware and still answer queries fast enough. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
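To make the idea of answering a query with MapReduce concrete, the following is a minimal in-memory sketch in plain Python, not the framework's actual schema or algorithm. It assumes a query made of two triple patterns that share the subject variable, for example `?x worksFor ?org . ?x livesIn ?city`. The data, the predicate names, and the functions `map_phase`, `reduce_phase`, and `answer` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("alice", "livesIn", "dallas"),
    ("carol", "livesIn", "austin"),
]


def map_phase(triples, predicates):
    """Emit (subject, (predicate, object)) for every triple whose
    predicate appears in one of the query's triple patterns."""
    for s, p, o in triples:
        if p in predicates:
            yield s, (p, o)


def reduce_phase(pairs, predicates):
    """Group emitted pairs by the join key (the shared subject) and keep
    only the keys that matched every pattern.

    Simplification: if a subject has several objects for one predicate,
    only the last one seen is kept."""
    grouped = defaultdict(dict)
    for key, (p, o) in pairs:
        grouped[key][p] = o
    wanted = set(predicates)
    return {k: v for k, v in grouped.items() if set(v) == wanted}


def answer(triples, predicates):
    """Run the map step and then the reduce step, as one job would."""
    return reduce_phase(map_phase(triples, predicates), predicates)
```

Calling `answer(TRIPLES, ["worksFor", "livesIn"])` keeps only the subjects that satisfy both patterns. In a real Hadoop job the grouping by key would be done by the shuffle phase across machines rather than by a dictionary.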
The Semantic Web is gaining immense popularity, and with it the Resource Description Framework (RDF), which is broadly used to model Semantic Web content. However, access control on RDF stores used for single machines has seldom been discussed in the literature. One significant obstacle to using RDF stores defined for single machines is their scalability. Cloud computers, on the other hand, have proven useful for storing large RDF stores, but to our knowledge these systems lack access control on RDF data. This work proposes a token-based access control system that is being implemented in Hadoop (an open source cloud computing framework). It defines six types of access levels and an enforcement strategy for the resulting access control policies. The enforcement strategy is implemented at three levels: Query Rewriting, Embedded Enforcement, and Postprocessing Enforcement. In Embedded Enforcement, policies are enforced during data selection using MapReduce, whereas in Postprocessing Enforcement they are enforced during the presentation of data to users. Experiments show that Embedded Enforcement consistently outperforms Postprocessing Enforcement due to the reduced number of jobs required.
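The difference between the two enforcement points can be illustrated with a small plain-Python sketch. It is not the paper's implementation: the policy model (a token grants read access to a set of predicates), the token names, the data, and all function names are assumptions made for illustration, and the six access levels are not modeled.

```python
# Hypothetical policy: each token grants read access to a set of predicates.
TOKEN_GRANTS = {
    "t1": {"name"},
    "t2": {"name", "salary"},
}

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "salary", "100"),
    ("bob", "name", "Bob"),
]


def allowed(token, triple):
    """Return True if the token grants access to the triple's predicate."""
    return triple[1] in TOKEN_GRANTS.get(token, set())


def select(triples, subject):
    """Plain data selection with no policy check."""
    return [t for t in triples if t[0] == subject]


def embedded(triples, subject, token):
    """Embedded style: the policy check is part of data selection,
    so disallowed triples never enter the result."""
    return [t for t in triples if t[0] == subject and allowed(token, t)]


def postprocessing(triples, subject, token):
    """Postprocessing style: select first, then filter the intermediate
    result in a second pass before it is presented to the user."""
    intermediate = select(triples, subject)
    return [t for t in intermediate if allowed(token, t)]
```

Both functions return the same triples for a given token. The postprocessing variant needs an extra pass over an intermediate result, which in a MapReduce setting would correspond to additional work after selection.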