Proceedings of the International Workshop on Semantic Big Data 2017
DOI: 10.1145/3066911.3066914

Extracting linked data from statistic spreadsheets

Abstract: Statistic data is an important sub-category of open data; it is interesting for many applications, including but not limited to data journalism, as such data is typically of high quality, and reflects (under an aggregated form) important aspects of a society's life such as births, immigration, economic output etc. However, such open data is often not published as Linked Open Data (LOD), limiting its usability. We provide a conceptual model for the open data comprised in statistic files published by INSEE, the le…

Citations: cited by 12 publications (11 citation statements)
References: 6 publications
“…To carry in our graph all the information from a 2d table, and enable meaningful search results, we adopt the approach in [12] for transforming 2d tables into Linked Open Data (RDF). Specifically, we create a header cell node for each header cell, shown as gray boxes at the bottom of Figure 3, and a value node for each data cell (white boxes in the figure).…”
Section: Figure 3: Conversion Of a 2d Table In A Graph (mentioning)
confidence: 99%
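The table-to-graph conversion described in this statement can be illustrated with a minimal sketch. It is not the cited authors' implementation: the function name `table_to_triples`, the `ex:` prefix, the `HeaderCell`/`ValueCell` classes, and the `closestHeader` property are all hypothetical, and the sketch assumes a table with a single header row. It creates one header cell node per header cell and one value node per data cell, and links each value node to the header of its column.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def table_to_triples(headers: List[str], rows: List[List[str]],
                     base: str = "ex:") -> List[Triple]:
    """Turn a 2D table with one header row into (subject, predicate, object) triples.

    All identifiers built here (base prefix, class and property names) are
    illustrative assumptions, not a published vocabulary.
    """
    triples: List[Triple] = []

    # One header cell node per header cell.
    for col, label in enumerate(headers):
        node = f"{base}header/{col}"
        triples.append((node, "rdf:type", f"{base}HeaderCell"))
        triples.append((node, "rdfs:label", label))

    # One value node per data cell, linked to the header node of its column.
    for r, row in enumerate(rows):
        for col, value in enumerate(row):
            node = f"{base}cell/{r}/{col}"
            triples.append((node, "rdf:type", f"{base}ValueCell"))
            triples.append((node, f"{base}value", value))
            triples.append((node, f"{base}closestHeader", f"{base}header/{col}"))

    return triples


if __name__ == "__main__":
    # Toy table with made-up values, for illustration only.
    for triple in table_to_triples(["Region", "Births"],
                                   [["A", "10"], ["B", "20"]]):
        print(triple)
```

Keeping header cells as nodes of their own, rather than flattening them into property names, means a keyword search can match header text and then follow the links to the data cells that header describes.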
“…We use data from French national institute for statistics and economic studies (INSEE) as an example as high-quality, trustful reference database. In previous work, we have extracted tens of thousands of RDF graphs out of INSEE statistic tables [2]. We also developed a novel keyword search algorithm which, given a set of search terms, e.g.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this work, we describe the last missing step of our system: the extraction of claims referring to statistical mentions from text sources. This step allows to automatically formulate the search queries which our system [3] can solve against the RDF corpus we gathered [2]. Our whole system can help fact-checking journalists to find checkable claims in massive text sources, as well as the closest reference datasource value for the given claim.…”
Section: Introduction (mentioning)
confidence: 99%
“…In a prior work [3], we have devised an approach to extract from high-quality, statistic Open Data in the "tables + text description" frequently used nowadays, Linked Open Data in RDF format; this is a first step toward addressing the above issues. Subsequently, we applied our approach to the complete set of statistics published by INSEE, a leading French national statistics institute, and republish the resulting RDF Open Data (together with the crawling code and the extraction code).…”
Section: Introduction (mentioning)
confidence: 99%
“…In the sequel, Section 2 outlines the statistic data sources we consider, their organization after our extraction [3], and our system architecture. Section 3 defines the search problem we address, and describes our search algorithms.…”
Section: Introduction (mentioning)
confidence: 99%