Links build the backbone of the Linked Data Cloud. With the steady growth in size of datasets comes an increased need for end users to know which frameworks to use for deriving links between datasets. In this survey, we comparatively evaluate current Link Discovery tools and frameworks. For this purpose, we outline general requirements and derive a generic architecture of Link Discovery frameworks. Based on this generic architecture, we study and compare the features of state-of-the-art linking frameworks. We also analyze reported performance evaluations for the different frameworks. Finally, we derive insights pertaining to possible future developments in the domain of Link Discovery.
Life science ontologies evolve frequently to meet new requirements or to better reflect the current domain knowledge. The development and adaptation of large and complex ontologies is typically performed collaboratively by several curators. To effectively manage the evolution of ontologies it is essential to identify the difference (Diff) between ontology versions. Such a Diff supports the synchronization of changes in collaborative curation, the adaptation of dependent data such as annotations, and ontology version management. We propose a novel approach COnto-Diff to determine an expressive and invertible diff evolution mapping between given versions of an ontology. Our approach first matches the ontology versions and determines an initial evolution mapping consisting of basic change operations (insert/update/delete). To semantically enrich the evolution mapping we adopt a rule-based approach to transform the basic change operations into a smaller set of more complex change operations, such as merge, split, or changes of entire subgraphs. The proposed algorithm is customizable in different ways to meet the requirements of diverse ontologies and application scenarios. We evaluate the proposed approach for large life science ontologies including the Gene Ontology and the NCI Thesaurus and compare it with PromptDiff. We further show how the Diff results can be used for version management and annotation migration in collaborative curation.
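As a rough illustration of the first step described above (an initial evolution mapping of basic insert/update/delete operations that can be inverted), the following minimal sketch may help. It is not the COnto-Diff implementation: the dict-based version format and the names `basic_diff` and `invert` are assumptions made for this example, and the rule-based enrichment into merge or split operations is not shown.

```python
def basic_diff(old, new):
    """Return basic change operations between two ontology versions.

    Assumption: each version is a dict mapping a concept ID to its label.
    """
    ops = []
    # Concepts that exist only in the new version were inserted.
    for cid in sorted(new.keys() - old.keys()):
        ops.append(("insert", cid, new[cid]))
    # Concepts that exist only in the old version were deleted.
    for cid in sorted(old.keys() - new.keys()):
        ops.append(("delete", cid, old[cid]))
    # Concepts present in both versions with a changed label were updated.
    for cid in sorted(old.keys() & new.keys()):
        if old[cid] != new[cid]:
            ops.append(("update", cid, old[cid], new[cid]))
    return ops


def invert(ops):
    """Return the operations that undo `ops` (new version back to old)."""
    inverse = []
    for op in ops:
        if op[0] == "insert":
            inverse.append(("delete", op[1], op[2]))
        elif op[0] == "delete":
            inverse.append(("insert", op[1], op[2]))
        else:
            # An update is undone by swapping old and new values.
            inverse.append(("update", op[1], op[3], op[2]))
    return inverse


# Hypothetical toy versions, not taken from any real ontology.
old_version = {"C1": "cell", "C2": "nucleus", "C3": "membrane"}
new_version = {"C1": "cell", "C2": "cell nucleus", "C4": "organelle"}
ops = basic_diff(old_version, new_version)
```

In this sketch, inverting the diff yields the same operations as diffing in the opposite direction, which is the property that makes such a mapping usable for moving between versions in both directions.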
Background: Ontologies are increasingly used to structure and semantically describe entities of domains, such as genes and proteins in life sciences. Their increasing size and the high frequency of updates, resulting in a large set of ontology versions, necessitate efficient management and analysis of this data. Results: We present GOMMA, a generic infrastructure for managing and analyzing life science ontologies and their evolution. GOMMA utilizes a generic repository to uniformly and efficiently manage ontology versions and different kinds of mappings. Furthermore, it provides components for ontology matching and for determining evolutionary ontology changes. These components are used by analysis tools, such as the Ontology Evolution Explorer (OnEX) and the detection of unstable ontology regions. We introduce the component-based infrastructure and show analysis results for selected components and life science applications. GOMMA is available at http://dbs.uni-leipzig.de/GOMMA. Conclusions: GOMMA provides a comprehensive and scalable infrastructure to manage large life science ontologies and analyze their evolution. Key functions include a generic storage of ontology versions and mappings, support for ontology matching, and determining ontology changes. The supported features for analyzing ontology changes are helpful to assess their impact on ontology-dependent applications such as term enrichment. GOMMA complements OnEX by providing functionalities to manage various versions of mappings between two ontologies and allows combining different match approaches.
Abstract. The continuous evolution of life science ontologies requires the adaptation of their associated mappings. We propose two approaches for tackling this problem in a largely automatic way: (1) a composition-based adaptation relying on the principle of mapping composition and (2) a diff-based adaptation algorithm individually handling change operations to update the mapping. Both techniques reuse unaffected correspondences, and adapt only the affected mapping part. We experimentally assess and confirm the effectiveness of our approaches for evolving mappings between large life science ontologies.
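The principle of mapping composition mentioned in (1) can be sketched as a join between the existing ontology mapping and an evolution mapping from the old to the new ontology version. This is only an illustration under assumptions: mappings are modeled as plain sets of ID pairs without similarity values, and the function name `compose` and all concept IDs are hypothetical rather than taken from the paper.

```python
def compose(mapping, evolution):
    """Compose an ontology mapping with an evolution mapping.

    mapping:   set of (a, b) pairs between ontology A and the old version of B
    evolution: set of (b, b_new) pairs between the old and new version of B
    Returns the set of (a, b_new) pairs reachable through a shared b.
    """
    # Index the evolution mapping by the old concept for fast lookup.
    successors = {}
    for old_c, new_c in evolution:
        successors.setdefault(old_c, set()).add(new_c)
    # Correspondences whose target has no successor are dropped.
    return {(a, c) for a, b in mapping for c in successors.get(b, ())}


# Hypothetical example: B1 is unchanged, B2 is split into B2a and B2b,
# and B3 has no successor in the new version.
mapping = {("A1", "B1"), ("A2", "B2"), ("A3", "B3")}
evolution = {("B1", "B1"), ("B2", "B2a"), ("B2", "B2b")}
adapted = compose(mapping, evolution)
```

Here the correspondence for the unchanged concept `B1` is carried over as-is, while only the correspondences touching changed concepts are rewritten or dropped.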
Abstract. Ontologies are heavily developed and used in life sciences and undergo continuous changes. However, the evolution of life science ontologies and references to them (e.g., annotations) is not well understood and has received little attention so far. We therefore propose a generic framework for analyzing both the evolution of ontologies and the evolution of ontology-related mappings, in particular annotations referring to ontologies and similarity (match) mappings between ontologies. We use our framework for an extensive comparative evaluation of evolution measures for 16 life science ontologies. Moreover, we analyze the evolution of annotation mappings and ontology mappings for the Gene Ontology.
Abstract. Matching life science ontologies to determine ontology mappings has recently become an active field of research. The large size of existing ontologies and the application of complex match strategies for obtaining high quality mappings make ontology matching a resource- and time-intensive process. To improve performance we investigate different approaches for parallel matching on multiple compute nodes. In particular, we consider inter-matcher and intra-matcher parallelism as well as the parallel execution of element- and structure-level matching. We implemented a distributed infrastructure for parallel ontology matching and evaluate different approaches for parallel matching of large life science ontologies in the field of anatomy and molecular biology.
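The difference between the two kinds of parallelism can be sketched roughly as follows: inter-matcher parallelism runs independent matchers concurrently on the same input, while intra-matcher parallelism partitions the input of a single matcher. This is a minimal single-machine sketch, not the distributed infrastructure described above; a thread pool stands in for multiple compute nodes, and the two toy matchers and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def string_matcher(pairs):
    """Toy element-level matcher: character-based similarity of two labels."""
    return {(a, b): SequenceMatcher(None, a, b).ratio() for a, b in pairs}


def token_matcher(pairs):
    """Toy element-level matcher: Jaccard overlap of the label tokens."""
    result = {}
    for a, b in pairs:
        ta, tb = set(a.split()), set(b.split())
        union = ta | tb
        result[(a, b)] = len(ta & tb) / len(union) if union else 0.0
    return result


def inter_matcher(pairs, matchers):
    """Run several independent matchers concurrently on the same pairs."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(m, pairs) for m in matchers]
        return [f.result() for f in futures]


def intra_matcher(pairs, matcher, n_parts):
    """Split the pairs into partitions and run one matcher on each part."""
    chunks = [pairs[i::n_parts] for i in range(n_parts)]
    merged = {}
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        for part in pool.map(matcher, chunks):
            merged.update(part)
    return merged


# Hypothetical label pairs to compare.
pairs = [("heart valve", "cardiac valve"), ("lung", "lung"), ("liver", "kidney")]
per_matcher = inter_matcher(pairs, [string_matcher, token_matcher])
partitioned = intra_matcher(pairs, string_matcher, 2)
```

Partitioning does not change the result of a matcher that scores each pair independently, so the partitioned run can be checked against a sequential one.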