“…In the instance matching track of OAEI-2010, the participants for IIMB2010 large dataset were Combinatorial Optimization for Data Integration (CODI) [20], Automated Semantic Mapping of Ontologies with Validation (ASMOV) [35] and RiMOM [19]. We experimented with our core instance matching system without property weight, we called as AFlood(PW-), and our augmented instance matching system with proposed automatic weight factor, we call as AFlood(PW+).…”
Section: Results Of Our Proposed System With IIMB-2010 Data Set
The proliferation of heterogeneous data sources of semantic knowledge bases intensifies the need for an automatic instance matching technique. However, the efficiency of instance matching is often influenced by the weight of a property associated with instances. Automatic weight generation is a non-trivial but important task in instance matching. Identifying an appropriate metric for generating a property's weight automatically is therefore a formidable task. In this paper, we investigate an approach for generating weights automatically under two hypotheses: (1) the weight of a property is directly proportional to the ratio of the number of its distinct values to the number of instances that contain the property, and (2) the weight is also proportional to the ratio of the number of distinct values of a property to the number of instances in a training dataset. The basic intuition behind our approach is the classical theory of information content that infrequent words
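The two ratios named in hypotheses (1) and (2) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `property_ratios`, the dictionary-based instance format, and the choice to return the two ratios separately are all assumptions made for illustration. The abstract does not say how the ratios are combined into a final weight, so that step is left out.

```python
from collections import defaultdict

def property_ratios(instances):
    """Compute, per property, the two ratios from hypotheses (1) and (2).

    `instances` is a hypothetical input format: a list of dicts, each
    mapping a property name to a single hashable value.
    Returns {property: (ratio_1, ratio_2)} where
      ratio_1 = distinct values / instances that contain the property
      ratio_2 = distinct values / all instances in the dataset
    """
    total = len(instances)
    distinct = defaultdict(set)   # property -> set of distinct values
    carriers = defaultdict(int)   # property -> number of instances having it
    for inst in instances:
        for prop, value in inst.items():
            distinct[prop].add(value)
            carriers[prop] += 1
    return {
        prop: (len(values) / carriers[prop], len(values) / total)
        for prop, values in distinct.items()
    }

# Toy training data (invented for this example only).
sample = [
    {"name": "a", "type": "x"},
    {"name": "b", "type": "x"},
    {"name": "c"},
    {"type": "x"},
]
ratios = property_ratios(sample)
```

In this toy data, `name` takes a different value in every instance that carries it, so its first ratio is high, while `type` always has the same value and scores low on both ratios. That matches the intuition that a property with rarely repeated values is more informative for matching.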
“…5 systems participated to the evaluation: DSSim [24], Ri-MOM [25], OKKAM [15], HMatch [26], and ASMOV [27]. In this first instance matching track, 4 systems out of 5 represented generic ontology matching tools, which included instance matching as a part of their functionality, while only one (OKKAM) was specifically aimed at resolving data level coreferences.…”
“…Many approaches were originally proposed as ontology matchers. Examples include RiMOM, LogMap, Asmov and ObjectCoref (Li, Tang, Li & Luo, 2009; Jiménez-Ruiz & Grau, 2011; Jean-Mary, Shironoshita & Kabuka, 2010; Hu, Qu & Sun, 2011). New systems continue to be proposed each year as part of the annual Ontology Alignment Evaluation Initiative (Ferrara, Nikolov, Noessner & Scharffe, 2013).…”
Resource Description Framework (RDF) is a graph-based data model used to publish data as a Web of Linked Data (Bizer et al. 2009). RDF is an emergent foundation for large-scale data integration, the problem of providing a unified view over multiple data sources. The structure in RDF data can be conveniently visualized using directed labeled graphs, as illustrated in the real-world graph fragments in Figure 1. Nodes in the graph represent entities (e.g. the node with label dbpedia:Allen_, Paul represents the entity Paul Allen in the DBpedia knowledge graph) and edges represent either attributes of an entity (e.g. '01/21/1953' is the birthdate of Paul Allen) or relationships between two entities (e.g. Paul Allen is the co-founder of the company entity, Microsoft). Facts in the knowledge base are formally represented as a set of triples, with a triple comprising a labeled edge (denoted as a property) in the RDF graph along with its incoming and outgoing nodes.