2011 IEEE 27th International Conference on Data Engineering 2011
DOI: 10.1109/icde.2011.5767868

Characteristic sets: Accurate cardinality estimation for RDF queries with multiple joins

Abstract: Accurate cardinality estimates are essential for a successful query optimization. This is not only true for relational DBMSs but also for RDF stores. An RDF database consists of a set of triples and, hence, can be seen as a relational database with a single table with three attributes. This makes RDF rather special in that queries typically contain many self joins. We show that relational DBMSs are not well-prepared to perform cardinality estimation in this context. Further, there are hardly any special cardina…
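
The abstract's point about self joins can be made concrete with a small sketch. It assumes a hypothetical single `triples(s, p, o)` table in an in-memory SQLite database and an invented helper `star_query_count`; neither comes from the paper. A star-shaped query with k triple patterns on the same subject becomes a k-way self join of that one table, and the number of result rows is the cardinality an optimizer would need to estimate.

```python
import sqlite3

# Hypothetical sample data: (subject, predicate, object) triples.
SAMPLE_TRIPLES = [
    ("book1", "title", "A"), ("book1", "author", "X"), ("book1", "year", "2001"),
    ("book2", "title", "B"), ("book2", "author", "Y"),
    ("book3", "title", "C"),
]

def star_query_count(triples, predicates):
    """Count result rows of a star query that binds one shared subject
    to every predicate in `predicates` (assumed non-empty)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)

    # One alias of the same table per triple pattern: this is the self join.
    aliases = [f"t{i}" for i in range(len(predicates))]
    from_clause = ", ".join(f"triples {a}" for a in aliases)
    conditions = [f"{a}.p = ?" for a in aliases]
    # Every pattern shares the subject of the first one.
    conditions += [f"{aliases[0]}.s = {a}.s" for a in aliases[1:]]

    sql = f"SELECT COUNT(*) FROM {from_clause} WHERE {' AND '.join(conditions)}"
    (count,) = conn.execute(sql, list(predicates)).fetchone()
    conn.close()
    return count

two_way = star_query_count(SAMPLE_TRIPLES, ["title", "author"])
three_way = star_query_count(SAMPLE_TRIPLES, ["title", "author", "year"])
```

Each additional triple pattern adds one more alias of the same table, which is why even modest RDF queries involve many joins.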

Cited by 167 publications (206 citation statements)
References 7 publications
“…The first query is used for identifying "characteristic sets" [13]: frequently co-occurring properties with a subject. The second identifies all the properties used in the dataset and sorts them according to their frequency.…”
Section: Methods
mentioning
confidence: 99%
“…We obtain a more compact schema than [10], by using the TF/IDF (Term Frequency/Inverted Document Frequency) measure from information retrieval [16] to detect discriminative properties, and using semantic information to merge similar CS's. Further, a schema graph of CS's is created by analyzing the co-reference relationship statistics between CS's.…”
Section: Emergent Schemas
mentioning
confidence: 99%
“…This was observed in the proposal to make SPARQL query optimization more reliable by recognizing "characteristics sets" [10]. A characteristic set is a combination of properties that typically co-occur with the same subject.…”
mentioning
confidence: 99%
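The definition quoted above (the combination of properties that co-occur with the same subject) can be sketched in a few lines. This is a minimal illustration over in-memory triples, not the implementation from the cited work: the function names and sample data are hypothetical, and the counting step in `count_distinct_subjects` is an assumed use of the statistics that the quoted text does not itself describe.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
EXAMPLE_TRIPLES = [
    ("book1", "title", "A"), ("book1", "author", "X"), ("book1", "year", "2001"),
    ("book2", "title", "B"), ("book2", "author", "Y"),
    ("book3", "title", "C"),
    ("book4", "title", "D"), ("book4", "author", "Z"),
]

def characteristic_sets(triples):
    """Map each distinct property set to the number of subjects that have it."""
    props_by_subject = defaultdict(set)
    for subject, predicate, _obj in triples:
        props_by_subject[subject].add(predicate)
    return Counter(frozenset(props) for props in props_by_subject.values())

def count_distinct_subjects(char_sets, query_props):
    """Count subjects whose property set contains every queried property.

    Assumption for illustration: sum the counts of all characteristic
    sets that are supersets of the queried properties.
    """
    wanted = frozenset(query_props)
    return sum(n for props, n in char_sets.items() if wanted <= props)

sets_with_counts = characteristic_sets(EXAMPLE_TRIPLES)
subjects_with_title_and_author = count_distinct_subjects(
    sets_with_counts, ["title", "author"]
)
```

Because subjects with identical property sets collapse into one entry, the table of characteristic sets is typically far smaller than the data it summarizes.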
“…However, the lack of a central schema causes a series of difficulties in the consumption of such data (e.g., [9,11,1,14]), e.g., having two different population numbers in the same KB. For instance, data users and knowledge engineers need an understanding of what information is available in order to write queries, and to reuse or engineer KBs [15,26]. In data management, cardinality is an important aspect of the structure of data.…”
Section: Introduction
mentioning
confidence: 99%