The Resource Description Framework (RDF) is a graph-based data model used to publish data as a Web of Linked Data (Bizer et al. 2009). RDF is an emergent foundation for large-scale data integration, the problem of providing a unified view over multiple data sources. The structure in RDF data can be conveniently visualized using
directed labeled graphs, as illustrated in the real-world graph fragments in Figure 1. Nodes in the graph represent entities (e.g. the node with label dbpedia:Allen,_Paul represents the entity Paul Allen in the DBpedia knowledge graph) and edges represent either attributes of an entity (e.g. '01/21/1953' is the birthdate of Paul Allen) or relationships between two entities (e.g. Paul Allen is the co-founder of the company entity, Microsoft). Facts in the knowledge base are formally represented as a set of
triples, with a triple comprising a labeled edge (denoted as a property) in the RDF graph along with its source and target nodes (the subject and object of the triple, respectively).
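
As a rough illustration of the triple representation described above, the two facts about Paul Allen could be encoded as follows. This is only a minimal sketch in plain Python, not an RDF library: the `Triple` type, the helper functions, the property names `ex:birthDate` and `ex:coFounderOf`, and the node label `dbpedia:Microsoft` are hypothetical names chosen for illustration, and the Paul Allen label is assumed to read `dbpedia:Allen,_Paul`.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: a labeled edge (prop) from a source node to a target node or value."""
    subj: str  # source node of the edge
    prop: str  # edge label, i.e. the property
    obj: str   # target node (an entity) or an attribute value


# Hypothetical property names; the real DBpedia vocabulary may differ.
TRIPLES = [
    Triple("dbpedia:Allen,_Paul", "ex:birthDate", "01/21/1953"),
    Triple("dbpedia:Allen,_Paul", "ex:coFounderOf", "dbpedia:Microsoft"),
]

# Labels treated as entity nodes in this sketch; any other object is an attribute value.
ENTITIES = {"dbpedia:Allen,_Paul", "dbpedia:Microsoft"}


def outgoing_edges(triples, node):
    """Return the (property, object) pairs of all edges leaving `node`."""
    return [(t.prop, t.obj) for t in triples if t.subj == node]


def is_relationship(triple, entities):
    """True if the edge links two entities, False if it attaches an attribute value."""
    return triple.obj in entities


if __name__ == "__main__":
    for t in TRIPLES:
        kind = "relationship" if is_relationship(t, ENTITIES) else "attribute"
        print(f"{t.subj} --{t.prop}--> {t.obj} ({kind})")
```

Under these assumptions, the birthdate triple is classified as an attribute and the co-founder triple as a relationship between two entities, mirroring the two kinds of edges distinguished in the text.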