On Feeding Business Systems with Linked Resources from the Web of Data

Cimmino, Andrea; Corchuelo, Rafael

doi:10.1007/978-3-319-93931-5_22

Cited by 3 publications

(7 citation statements)

References 19 publications

“…Enriching covers a large number of possible tasks [69], from transforming RDF data in order to increase its quality, e.g., removing white spaces or capitalising names, up to creating new data on the fly, e.g., completing the RDF data of a KG with machine learning [1]. Linking aims at producing links between the RDF resources by means of link rules [15]. Additionally, these links may involve RDF resources from different KGs; creating these links is one of the Linked Data principles [3].…”

Section: Requirements Of a Knowledge Graph Life Cycle (mentioning)

confidence: 99%

“…RDF link discovery aims at producing relationships between local RDF resources and other RDF resources allocated in different KGs [58]. On the one hand, there is a wide number of tools that aim at producing link rules [16,17,18,23,43,44,60,61,75], i.e., restrictions under which two RDF resources are linked. On the other hand, other tools focus on applying those rules efficiently and producing the links among resources [26,59,79].…”

Section: Knowledge Graph Curation (mentioning)

confidence: 99%

“…The RDF Generator Module has several internal translators in order to understand different mapping languages, like RML, the WoT-Mappings, or the JSON serialisation of the model depicted by Fig. 4.…”

Section: Conceptual Mappings (mentioning)

confidence: 99%

“…For this reason, the RDF Generator Module counts with a dynamic system for loading plugins. This allows users to develop new data providers or handlers without modifying the code of Helio, and load these extensions dynamically.…”

Section: Extending the RDF Generator Module (mentioning)

confidence: 99%

Helio: A framework for implementing the life cycle of knowledge graphs

Cimmino

García‐Castro

2024

Building and publishing knowledge graphs (KG) as Linked Data, either on the Web or in private companies, has become a relevant and crucial process in many domains. This process requires that users perform a wide number of tasks conforming to the life cycle of a KG, and these tasks usually involve different unrelated research topics, such as RDF materialisation or link discovery. There is already a large corpus of tools and methods designed to perform these tasks; however, the lack of one tool that gathers them all leads practitioners to develop ad-hoc pipelines that are not generic and, thus, non-reusable. As a result, building and publishing a KG is becoming a complex and resource-consuming process. In this paper, a generic framework called Helio is presented. The framework aims to cover a set of requirements elicited from the KG life cycle and provide a tool capable of performing the different tasks required to build and publish KGs. As a result, Helio aims at providing users with the means for reducing the effort required to perform this process and, also, Helio aims to prevent the development of ad-hoc pipelines. Furthermore, the Helio framework has been applied in many different contexts, from European projects to research work.

“…Before feeding the record sets returned by data extraction into a particular application, it is commonly necessary to perform some of the following integration tasks: semantisation [25,45,54,55,60,63,71], which either maps the descriptors onto the terminology box of a particular ontology or the tuples onto its assertion box [19]; union [23], which merges record sets that provide similar data; finding primary keys [62], which determines which components of the tuples identify them as univocally as possible; record linkage [8,11,12], which finds different records that refer to the same actual entities; augmentation [6,52,67], which joins record sets on the same topic to complete the information that they provide individually; and cleaning [10,31,61], which fixes data. Note that the integration tasks are orthogonal to data extraction because they are independent from the source of the record sets, which is the reason why they fall out of the scope of this article.…”

Section: Data-extraction Vocabulary (mentioning)

confidence: 99%

On extracting data from tables that are encoded using HTML

Roldán

Jiménez

Corchuelo

2020

Knowledge-Based Systems

Tables are a common means to display data in human-friendly formats. Many authors have worked on proposals to extract those data back since this has many interesting applications. In this article, we summarise and compare many of the proposals to extract data from tables that are encoded using HTML and have been published between 2000 and 2018. We first present a vocabulary that homogenises the terminology used in this field; next, we use it to summarise the proposals; finally, we compare them side by side. Our analysis highlights several challenges to which no proposal provides a conclusive solution and a few more that have not been addressed sufficiently; simply put, no proposal provides a complete solution to the problem, which seems to suggest that this research field shall keep active in the near future. We have also realised that there is no consensus regarding the datasets and the methods used to evaluate the proposals, which hampers comparing the experimental results.

On learning context-aware rules to link RDF datasets

Cimmino

Corchuelo

2020

Logic Journal of the IGPL

Integrating RDF datasets has become a relevant problem for both researchers and practitioners. In the literature, there are many genetic proposals that learn rules that allow to link the resources that refer to the same real-world entities, which is paramount to integrating the datasets. Unfortunately, they are context-unaware because they focus on the resources and their attributes but forget about their neighbours. This implies that they fall short in cases in which different resources have similar attributes but refer to different real-world entities or cases in which they have dissimilar attributes but refer to the same real-world entities. In this article, we present a proposal that learns context-aware rules that take into account both the attributes of the resources and their neighbours. We have conducted an extensive experimentation that proves that it outperforms the most advanced genetic proposal. Our conclusions were checked using statistically sound methods.
