2011
DOI: 10.14778/2002938.2002939

Recovering semantics of tables on the web

Abstract: The Web offers a corpus of over 100 million tables [6], but the meaning of each table is rarely explicit from the table itself. Header rows exist in few cases and even when they do, the attribute names are typically useless. We describe a system that attempts to recover the semantics of tables by enriching the table with additional annotations. Our annotations facilitate operations such as searching for tables and finding related tables. To recover semantics of tables, we leverage a database of class labels and…

Cited by 258 publications (325 citation statements)
References 25 publications
“…In the last several years, an active and inventive group at Google, possibly inspired by Halevy, Norvig, and Pereira [37], collected and analyzed millions of tables harvested from the web [1,38,39]. Visual verification of their results has necessarily been restricted to much smaller samples.…”
Section: Physical Structure Extraction (mentioning)
confidence: 99%
“…CSV and HTML tables can be turned into RDF with dedicated tools [14,21]. Larger frameworks, like Open Refine + DERI's RDF plugin [7,20], Opencube [13] On enriching the RDF graphs coming out of the statistical tables, [25] annotates Web tables using class labels and relationships automatically extracted from the Web to augment the semantics and improve access. Integrated HTML tables are used to extend search aggregated results [8,27] and to insert Web table data into word processing software [8].…”
Section: Related Work (mentioning)
confidence: 99%
“…For this reason, we update these statistics every time we run the conversion workflow. 25 . This allows us to keep track of what is left to map.…”
Section: Linked Dataset Description (mentioning)
confidence: 99%
“…Semantic annotation of services [21,22] and more recently of web tables [16,18,25] has also received attention. Most of this work learns types for services parameters or table columns, but is limited in learning relationships.…”
Section: Related Work (mentioning)
confidence: 99%