The volume of RDF data continues to grow over the past decade and many known RDF datasets have billions of triples. A grant challenge of managing this huge RDF data is how to access this big RDF data efficiently. A popular approach to addressing the problem is to build a full set of permutations of (S, P, O) indexes. Although this approach has shown to accelerate joins by orders of magnitude, the large space overhead limits the scalability of this approach and makes it heavyweight. In this paper, we present TripleBit, a fast and compact system for storing and accessing RDF data. The design of TripleBit has three salient features. First, the compact design of TripleBit reduces both the size of stored RDF data and the size of its indexes. Second, TripleBit introduces two auxiliary index structures, ID-Chunk bit matrix and ID-Predicate bit matrix, to minimize the cost of index selection during query evaluation. Third, its query processor dynamically generates an optimal execution ordering for join queries, leading to fast query execution and effective reduction on the size of intermediate results. Our experiments show that TripleBit outperforms RDF-3X, MonetDB, BitMat on LUBM, UniProt and BTC 2012 benchmark queries and it offers orders of mangnitude performance improvement for some complex join queries.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.