2019
DOI: 10.48550/arxiv.1911.09356
Preprint

Schemaless Queries over Document Tables with Dependencies

Abstract: Unstructured enterprise data such as reports, manuals and guidelines often contain tables. The traditional way of integrating data from these tables is through a two-step process of table detection/extraction and mapping the table layouts to an appropriate schema. This can be an expensive process. In this paper we show that by using semantic technologies (RDF/SPARQL and database dependencies) paired with a simple but powerful way to transform tables with non-relational layouts, it is possible to offer query a…

citations
Cited by 1 publication
(1 citation statement)
references
References 10 publications
“…The tabular data format is commonly used in digital documents such as PDFs and HTMLs to store semistructured information (Canim et al., 2019; Zhang and Balog, 2018; Pasupat and Liang, 2015).…”
Section: Introduction (mentioning)
confidence: 99%