R2RML is used to specify transformations of data avail able in relational databases into materialised or virtual RDF datasets. SPARQL queries evaluated against virtual datasets are translated into SQL queries according to the R2RML mappings, so that they can be evaluated over the underly ing relational database engines. In this paper we describe an extension of a well-known algorithm for SPARQL to SQL translation, originally formalised for RDBMS-backed triple stores, that takes into account R2RML mappings. We present the result of our implementation using queries from a synthetic benchmark and from three real use cases, and show that SPARQL queries can be in general evaluated as fast as the SQL queries that would have been generated by SQL experts if no R2RML mappings had been used.
Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets (e.g., relational databases, CSV and JSON files), either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented (e.g., referential integrity among sources, datatypes, or data integrity); thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work is focused on applying implicit constraints on the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with the GTFS-Madrid benchmark; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, without and with the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude, while it is able to produce all the query answers.
We present the process that has been followed for the development of an application that makes use of several heterogeneous Spanish public datasets that are related to administrative, hydrographic, and statistical domains. Our application aims at analysing existing relations between the Spanish coastal area and different statistical variables such as unemployment, population, dwelling, industry, and building trade. Moreover, we provide an important innovation with respect to other similar processes followed in other initiatives by dealing with the geometrical information of features.
Knowledge graphs are often generated using rules that apply semantic annotations to data sources. Software tools then execute these rules and generate or virtualize the corresponding RDF-based knowledge graph. RML is an extension of the W3C-recommended R2RML language, extending support from relational databases to other data sources, such as data in CSV, XML, and JSON format. As part of the R2RML standardization process, a set of test cases was created to assess tool conformance the specification. In this work, we generated an initial set of reusable test cases to assess RML conformance. These test cases are based on R2RML test cases and can be used by any tool, regardless of the programming language. We tested the conformance of two RML processors: the RMLMapper and CARML. The results show that the RMLMapper passes all CSV, XML, and JSON test cases, and most test cases for relational databases. CARML passes most CSV, XML, and JSON test cases regarding. Developers can determine the degree of conformance of their tools, and users determine based on conformance results to determine the most suitable tool for their use cases.
In the last decade, REST has become the most common approach to provide web services, yet it was not originally designed to handle typical modern applications (e.g. mobile apps). GraphQL was proposed to reduce the number of queries and data exchanged in comparison with REST. Since its release in 2015, it has gained momentum as an alternative approach to REST. However, generating and maintaining GraphQL resolvers is not a simple task. First, a domain expert has to analyze a dataset, design the corresponding GraphQL schema and map the dataset to the schema. Then, a software engineer (e.g. GraphQL developer) implements the corresponding GraphQL resolvers in a specific programming language. In this paper, we present an approach to exploit the information from mappings rules (relation between target and source schema) and generate a GraphQL server. These mapping rules construct a virtual knowledge graph which is accessed by the generated GraphQL resolvers. These resolvers translate the input GraphQL queries into the queries supported by the underlying dataset. Domain experts or software developers may benefit from our approach: a domain expert does not need to involve software developers to implement the resolvers, and software developers can generate the initial version of the resolvers to be implemented. We implemented our approach in the Morph-GraphQL framework and evaluated it using the LinGBM benchmark.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.