Part 3: Policy and Stakeholders
Data-driven innovation has great potential for developing services that not only have economic value but also help to address societal challenges. Many of these challenges can only be addressed by sharing publicly and privately owned data. Such public-private data sharing collaborations require data governance rules. Data governance can address many barriers, for example by deploying a decision model that guides choices about data sharing and leads to interventions supported by a data sharing platform. Based on a literature review of data governance and three use cases for data sharing in the logistics sector, we have developed a data sharing decision model from the perspective of a data provider. The decision model addresses technical as well as ownership, privacy, and economic barriers to sharing publicly and privately owned data, and subsequently proposes interventions to address these barriers. We found that the decision model is useful for identifying and addressing data sharing barriers, as it is applicable to, among others, privacy-sensitive and commercially sensitive data.
Ontology alignment is widely used to find the correspondences between different ontologies in diverse fields. After discovering the alignments, several performance scores are available to evaluate them. The scores typically require the identified alignment and a reference containing the underlying actual correspondences of the given ontologies. The current trend in alignment evaluation is to put forward a new score (e.g., precision, weighted precision, semantic precision) and to compare various alignments by juxtaposing the obtained scores. However, selecting one measure among the others for comparison is contentious. On top of that, a claim that one system performs better than another cannot be substantiated solely by comparing two scalars. In this paper, we propose statistical procedures that enable us to theoretically favor one system over another. McNemar's test is the statistical means by which two ontology alignment systems are compared over one matching task. The test applies to a 2 × 2 contingency table, which can be constructed in two different ways based on the alignments, each with its own merits and pitfalls. The ways of constructing the contingency table and various apposite statistics from McNemar's test are elaborated in detail. When more than two alignment systems are compared, the multiple comparisons inflate the family-wise error rate. Thus, ways of controlling this error are also discussed. A directed graph visualizes the outcome of McNemar's test in the presence of multiple alignment systems. From this graph, it is readily understood whether one system is better than another or whether their differences are imperceptible. The proposed statistical methodologies are applied to the systems that participated in the OAEI 2016 anatomy track, and are also used to compare several well-known similarity metrics on the same matching problem.
Additional Key Words and Phrases: ontology alignment; McNemar's test; family-wise error rate; anatomy; OAEI
ACM Reference format: Majid Mohammadi, Amir Ahooye Atashin, Wout Hofman, and Yaohua Tan. 2017. Comparison of ontology alignment systems across single matching task via the McNemar's test.
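As an illustration of the kind of test described in this abstract, the following is a minimal sketch of McNemar's test computed from the two discordant cells of a 2 × 2 contingency table. It does not reproduce the paper's own table constructions, which the abstract does not spell out. The function name `mcnemar` and the inputs `b` and `c` are hypothetical: `b` is assumed to count correspondences that system A gets right and system B gets wrong with respect to a reference alignment, and `c` the reverse.

```python
from math import comb, erfc, sqrt


def mcnemar(b: int, c: int) -> dict:
    """McNemar's test from the discordant counts of a 2x2 table.

    b: cases where system A is right and system B is wrong (assumed meaning)
    c: cases where system A is wrong and system B is right (assumed meaning)
    Returns the chi-square statistic and three two-sided p-values.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return {"statistic": 0.0, "asymptotic": 1.0, "exact": 1.0, "mid_p": 1.0}

    # Asymptotic version: chi-square statistic with 1 degree of freedom.
    statistic = (b - c) ** 2 / n
    # The chi-square(1) survival function at x equals erfc(sqrt(x / 2)).
    p_asymptotic = erfc(sqrt(statistic / 2))

    # Exact version: under the null hypothesis each discordant pair falls
    # in either cell with probability 0.5, i.e. Binomial(n, 0.5).
    k = min(b, c)
    pmf_k = comb(n, k) / 2 ** n
    cdf_k = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_exact = min(1.0, 2 * cdf_k)
    # Mid-p version: count only half the probability of the observed value.
    p_mid = min(1.0, 2 * cdf_k - pmf_k)

    return {
        "statistic": statistic,
        "asymptotic": p_asymptotic,
        "exact": p_exact,
        "mid_p": p_mid,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(mcnemar(10, 2))
```

Only the discordant counts enter the test; correspondences on which both systems agree do not affect the result. With several systems, each pairwise p-value would additionally need a multiple-comparison correction to control the family-wise error rate.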
Ontology matching systems are typically compared by their average performance over multiple datasets. However, this paper examines the alignment systems using statistical inference, since averaging is statistically unsafe and inappropriate. The statistical tests for comparing two or more alignment systems are reviewed theoretically and empirically. For comparing two systems, the Wilcoxon signed-rank test and McNemar's mid-p and asymptotic tests are recommended due to their robustness and statistical safety in different circumstances. The Friedman and Quade tests, with their corresponding post-hoc procedures, are studied for comparing multiple systems, and their advantages and disadvantages are discussed. The statistical methods are then applied to the benchmark and MultiFarm tracks of the Ontology Alignment Evaluation Initiative (OAEI) 2015, and the results are reported and visualized with critical difference diagrams.
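To make the multi-system comparison concrete, here is a minimal sketch of the rank-based quantities behind the Friedman test: the average rank of each system across datasets and the Friedman chi-square statistic. Average ranks are also what a critical difference diagram plots. The function names and the layout of `scores` are assumptions for illustration, the statistic is the basic form without a tie correction, and the p-value is omitted because it requires the chi-square distribution with k − 1 degrees of freedom.

```python
def average_ranks(scores):
    """Average rank of each system over all datasets.

    scores[i][j] is the score of system j on dataset i (higher is better).
    Rank 1 is best; tied systems share the mean of the ranks they span.
    """
    n_datasets = len(scores)
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        # System indices ordered from best to worst score on this dataset.
        order = sorted(range(k), key=lambda j: -row[j])
        i = 0
        while i < k:
            # Extend j to the end of the run of systems tied with position i.
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            # Positions i..j are 0-based, so the shared 1-based rank is:
            shared = (i + j) / 2 + 1
            for pos in range(i, j + 1):
                totals[order[pos]] += shared
            i = j + 1
    return [t / n_datasets for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic (basic form, no tie correction)."""
    n = len(scores)
    k = len(scores[0])
    ranks = average_ranks(scores)
    sum_sq = sum(r * r for r in ranks)
    return 12 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


if __name__ == "__main__":
    # Hypothetical F-measures of 3 systems on 4 datasets, not from the paper.
    scores = [
        [0.90, 0.80, 0.70],
        [0.85, 0.80, 0.60],
        [0.70, 0.75, 0.50],
        [0.95, 0.90, 0.90],
    ]
    print(average_ranks(scores))
    print(friedman_statistic(scores))
```

Working on ranks rather than raw scores is what avoids the averaging problem the abstract points out: one dataset with unusually large score differences cannot dominate the comparison.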