Ontologies are at the heart of knowledge management and make use of information that is not only written in English but also in many other natural languages. In order to enable knowledge discovery, sharing and reuse of these multilingual ontologies, it is necessary to support ontology mapping despite natural language barriers. This paper examines the soundness of a generic approach that involves machine translation tools and monolingual ontology matching techniques in cross-lingual ontology mapping scenarios. In particular, experimental results collected from case studies which engage mappings of independent ontologies that are labeled in English and Chinese are presented. Based on findings derived from these studies, limitations of this generic approach are discussed. It is shown with evidence that appropriate translations of conceptual labels in ontologies are of crucial importance when applying monolingual matching techniques in cross-lingual ontology mapping. Finally, to address the identified challenges, a semantic-oriented cross-lingual ontology mapping (SOCOM) framework is proposed and discussed.
This paper describes a semi-automated process, framework and tools for harvesting, assessing, improving and maintaining high-quality linked-data. The framework, known as DaCura 1 , provides dataset curators, who may not be knowledge engineers, with tools to collect and curate evolving linked data datasets that maintain quality over time. The framework encompasses a novel process, workflow and architecture. A working implementation has been produced and applied firstly to the publication of an existing social-sciences dataset, then to the harvesting and curation of a related dataset from an unstructured data-source. The framework's performance is evaluated using data quality measures that have been developed to measure existing published datasets. An analysis of the framework against these dimensions demonstrates that it addresses a broad range of real-world data quality concerns. Experimental results quantify the impact of the DaCura process and tools on data quality through an assessment framework and methodology which combines automated and human data quality controls.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.