One-to-one correspondences are not always sufficient to align ontologies accurately; complex correspondences with conditions and transformations may be required instead. Correspondence patterns provide models that can guide the development of complex correspondences. However, it is first necessary to identify which pattern applies to a given alignment problem. This PhD proposes the development of algorithms, methods and processes for refining elementary correspondences between concepts or relations into complex ones by identifying which correspondence pattern best represents a given correspondence. To date, an evaluation of a system to refine correspondences between classes in the YAGO and DBpedia ontologies has been completed. This evaluation showed that, for a subsumption correspondence, a training set of 30 instances of the class being mapped was sufficient to refine the match to a conditional one 89% of the time. Hence we have shown that this is a promising approach for correspondences with a conditional element; correspondences with a translation element will be examined next.
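To make the idea of refining a subsumption correspondence into a conditional one concrete, the following is a minimal sketch, not the system evaluated in the abstract. It assumes a hypothetical `ConditionalCorrespondence` structure (an instance of the source class belongs to the target class only when one attribute has a given value) and a hypothetical `refine` helper that picks the attribute-value condition with the highest F1 score on a small labelled sample. The class names, attributes and sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalCorrespondence:
    """Hypothetical structure: source_class instances map to target_class
    only when attribute `prop` has the value `value`."""
    source_class: str
    target_class: str
    prop: str
    value: str

    def applies(self, attributes: dict) -> bool:
        return attributes.get(self.prop) == self.value


def refine(source_class, target_class, sample):
    """Pick the attribute-value condition with the best F1 on the sample.

    sample: list of (attributes, is_target) pairs, where is_target says
    whether the source instance is also an instance of the target class.
    """
    candidates = {(p, v) for attrs, _ in sample for p, v in attrs.items()}
    best, best_f1 = None, 0.0
    for prop, value in sorted(candidates):
        corr = ConditionalCorrespondence(source_class, target_class, prop, value)
        tp = sum(1 for a, t in sample if corr.applies(a) and t)
        fp = sum(1 for a, t in sample if corr.applies(a) and not t)
        fn = sum(1 for a, t in sample if not corr.applies(a) and t)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best, best_f1 = corr, f1
    return best, best_f1


# Invented training sample: attributes of ex:Person instances, and whether
# each one is also an ex:Scientist.
sample = [
    ({"occupation": "scientist", "country": "FR"}, True),
    ({"occupation": "scientist", "country": "DE"}, True),
    ({"occupation": "painter", "country": "FR"}, False),
    ({"occupation": "scientist", "country": "FR"}, True),
    ({"occupation": "writer", "country": "DE"}, False),
]

best, score = refine("ex:Person", "ex:Scientist", sample)
print(best, score)
```

On this toy sample the selected condition is `occupation = scientist`. A real refinement step would also need to handle multi-valued and missing attributes, which this sketch ignores.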
Linked Open Data consists of a large set of structured knowledge bases that have been linked together, typically using equivalence statements. These equivalences usually take the form of owl:sameAs statements linking individuals; links between classes are far less common. Often, classes are left unlinked because the relationships between them cannot be described as elementary one-to-one equivalences. Instead, complex correspondences referencing multiple entities in logical combinations are often necessary to describe how the classes in one ontology are related to classes in a second ontology. In this paper the authors introduce a novel Bayesian Restriction Class Correspondence Estimation (Bayes-ReCCE) algorithm, an extensional approach to detecting complex correspondences between classes. Bayes-ReCCE operates by analysing features of matched individuals in the knowledge bases, and uses Bayesian inference to search for complex correspondences between the classes these individuals belong to. Bayes-ReCCE is designed to provide meaningful results even when only a small number of matched instances is available. The authors demonstrate this capability empirically, showing that the complex correspondences generated by Bayes-ReCCE have a median F1 score of over 0.75 when compared against a gold standard set of complex correspondences between Linked Open Data knowledge bases covering the geographical and cinema domains. In addition, the authors discuss how metadata produced by Bayes-ReCCE can be included in the correspondences to encourage reuse, allowing users to make more informed decisions about the meaning of the relationship each correspondence describes.
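The general idea of using Bayesian inference over features of matched individuals can be illustrated with a small sketch. This is not the Bayes-ReCCE algorithm itself, whose details are not given in the abstract. It assumes a simple Beta-Bernoulli model: for each candidate restriction (a property-value pair), the posterior mean probability that a matched individual satisfying the restriction belongs to the target class. The function name `restriction_posteriors`, the uniform Beta(1, 1) prior and the cinema-style sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def restriction_posteriors(instances, target_class, alpha=1.0, beta=1.0):
    """Posterior mean of P(target_class | restriction) under a Beta prior.

    instances: list of (features, classes) pairs for matched individuals,
    where features is a dict from the source knowledge base and classes is
    the set of classes the matched individual has in the target one.
    alpha, beta: parameters of the assumed Beta prior (uniform by default).
    """
    counts = defaultdict(lambda: [0, 0])  # restriction -> [in target, not in target]
    for features, classes in instances:
        index = 0 if target_class in classes else 1
        for prop, value in features.items():
            counts[(prop, value)][index] += 1
    return {
        restriction: (alpha + pos) / (alpha + beta + pos + neg)
        for restriction, (pos, neg) in counts.items()
    }


# Invented matched individuals: source-side features, target-side classes.
instances = [
    ({"genre": "horror", "country": "US"}, {"HorrorFilm"}),
    ({"genre": "horror", "country": "UK"}, {"HorrorFilm"}),
    ({"genre": "comedy", "country": "US"}, {"ComedyFilm"}),
    ({"genre": "horror", "country": "US"}, {"HorrorFilm"}),
    ({"genre": "drama", "country": "UK"}, set()),
]

posteriors = restriction_posteriors(instances, "HorrorFilm")
best = max(posteriors, key=posteriors.get)
print(best, posteriors[best])
```

With only three supporting individuals the posterior mean for `genre = horror` is 0.8 rather than 1.0: the prior keeps the estimate cautious when few matched instances are available, which is the kind of behaviour a small-sample method needs.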