Abstract. With a growing number of ontologies used in the semantic web, agents can fully make sense of different datasets only if correspondences between those ontologies are known. Ontology matching tools have been proposed to find such correspondences. While the current research focus is mainly on fully automatic matching tools, some approaches have been proposed that involve the user in the matching process. However, there are currently no benchmarks and test methods to compare such tools. In this paper, we introduce a number of quality measures for interactive ontology matching tools, and we discuss means to automatically run benchmark tests for such tools. To demonstrate how those evaluation can be designed, we show examples on assessing the quality of interactive matching tools which involve the user in matcher selection and matcher parametrization.
In this paper, we present the Ontology Matching EvaLuation Toolkit (MELT), a software toolkit to facilitate ontology matcher development, configuration, evaluation, and packaging. Compared to existing tools in the ontology matching domain, our framework offers detailed evaluation capabilities on the correspondence level of alignments as well as extensive group evaluation possibilities. A particular focus is put on a streamlined development and evaluation process along with ease of use for matcher developers and evaluators. Our contributions are twofold: We present an open source matching toolkit that integrates well into existing platforms, as well as an exemplary analysis of two OAEI 2018 tracks demonstrating advantages and analytical capabilities of MELT.
Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6 M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted. In a second series of experiments, we show how the approach can be transferred to DBkWik, a knowledge graph extracted from thousands of Wikis. We discuss the challenges and first results of extracting relations from a larger set of Wikis, using a less formalized knowledge graph.
Open information extraction (OIE) is the task of extracting relations and their corresponding arguments from natural language text in unsupervised manner. Outputs of such systems are used for downstream tasks such as question answering and automatic knowledge base (KB) construction. Many of these downstream tasks rely on aligning OIE triples with reference KBs. Such alignments are usually evaluated w.r.t. a specific downstream task and, to date, no direct manual evaluation of such alignments has been performed. In this paper, we directly evaluate how OIE triples from the OPIEC corpus are related to the DBpedia KB w.r.t. information content. First, we investigate OPIEC triples and DBpedia facts having the same arguments by comparing the information on the OIE surface relation with the KB relation. Second, we evaluate the expressibility of general OPIEC triples in DBpedia. We investigate whether-and, if so, how-a given OIE triple can be mapped to a single KB fact. We found that such mappings are not always possible because the information in the OIE triples tends to be more specific. Our evaluation suggests, however, that significant part of OIE triples can be expressed by means of KB formulas instead of individual facts.
SciGraph is a Linked Open Data graph published by Springer Nature which contains information about conferences and conference publications. In this paper, we discuss how this dataset can be utilized to build a conference recommendation system, yielding a recall@10 of up to 0.665, and a MAP of up to 0.540, generating recommendations based on authors, abstracts, and keywords. Furthermore, we show how the dataset can be linked to WikiCFP to recommend upcoming conferences.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.