2014
DOI: 10.1007/978-3-319-11964-9_14
LOD Laundromat: A Uniform Way of Publishing Other People’s Dirty Data

Abstract: It is widely accepted that proper data publishing is difficult. The majority of Linked Open Data (LOD) does not meet even a core set of data publishing guidelines. Moreover, datasets that are clean at creation can get stains over time. As a result, the LOD cloud now contains a high level of dirty data that is difficult for humans to clean and for machines to process. Existing solutions for cleaning data (standards, guidelines, tools) are targeted towards human data creators, who can (and do) choose not to use …

Cited by 115 publications (115 citation statements)
References 10 publications
“…HDT has been widely adopted by the community, (i) used as the main backend of Triple Pattern Fragments (TPF) [18] interface, which alleviates the traditional burden of LOD servers by moving part of the query processing onto clients, (ii) used as a storage backend for large-scale graph data [16], or (iii) as the store behind LOD Laundromat [3], serving a crawl of a very big subset of the LOD Cloud, to name but a few.…”
Section: Introduction
confidence: 99%
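The citation statement above notes that the TPF interface shifts part of the query processing onto clients: a client requests one triple pattern at a time and performs joins locally. A minimal sketch of how such a request URL is constructed is shown below; the endpoint URL is hypothetical, and real TPF servers advertise their exact parameter names in hypermedia controls rather than fixing them in advance.

```python
from urllib.parse import urlencode

def tpf_fragment_url(endpoint, subject=None, predicate=None, obj=None):
    """Build a request URL for one Triple Pattern Fragment.

    A TPF client sends a single triple pattern per request; the server
    answers with a page of matching triples plus count metadata, and the
    client combines fragments locally to evaluate a full query.
    """
    params = {}
    if subject is not None:
        params["subject"] = subject
    if predicate is not None:
        params["predicate"] = predicate
    if obj is not None:
        params["object"] = obj
    return endpoint + ("?" + urlencode(params) if params else "")

# Hypothetical endpoint; parameter names follow the common TPF convention.
url = tpf_fragment_url(
    "http://example.org/fragments/dataset",
    predicate="http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
)
print(url)
# → http://example.org/fragments/dataset?predicate=http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type
```

Because each request is a simple, cacheable HTTP GET for one pattern, the server's per-request cost stays low, which is what allows a single machine to expose many datasets at once.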
“…Third, LOD Laundromat [2] crawls, cleans and republishes more than 650K LOD datasets, collected through popular data catalogs like Datahub, as HDT datasets, serving a TPF endpoint for each of them on a single server, which would not be possible with the more expressive SPARQL endpoint interface. Although this has significantly reduced the cost of Linked Data publishing and consumption, data scientists who wish to run large-scale analyses need to query many TPF endpoints and integrate the results.…”
Section: Introduction
confidence: 99%
“…At the heart of Linked Data lies the Resource Description Framework (RDF) 4 , a data model for expressing metadata about any resource. The Linked Open Data Cloud [6] and LOD Laundromat [1] consist of billions of such metadata statements. With a focus on what these RDF statements are about, we can distinguish between two types of Linked Data datasets.…”
Section: Introduction
confidence: 99%