2006
DOI: 10.1007/11669487_15

Notes on Contemporary Table Recognition

Abstract: The shift of interest to web tables in HTML and PDF files, coupled with the incorporation of table analysis and conversion routines in commercial desktop document processing software, are likely to turn table recognition into more of a systems than an algorithmic issue. We illustrate the transition by some actual examples of web table conversion. We then suggest that the appropriate target format for table analysis, whether performed by conventional customized programs or by off-the-shelf software, is…

Cited by 27 publications (14 citation statements)
References 9 publications (9 reference statements)
“…Anomalous word-frequency profile checks (like "spare" appearing too often in a document that has an above average number of occurrences of "spore") may be useful as well. Support for the logical markup of tables can be had by integrating a tool we built previously for that purpose [4].…”
Section: Prototype (mentioning)
confidence: 99%
“…Suppose, for instance, that we process the left-hand table in Figure 8 and include it into the ontology. Then when we encounter the right-hand table we hope to be able to learn that the hepth of goldam is 320 gd [26]. Our current plans to build interactive software for harvesting web tables based on the formalisms described above are outlined in Section 5.…”
Section: Labeled Table Candidates For Which Wang Notation Exists Are (mentioning)
confidence: 99%
“…Again, we would like a format that allows a smooth transition to a database query language. We favor Wang Notation, which provides a layout-independent representation of the relations between hierarchical category headers and content cells [149,150]. For archival circuit diagrams and engineering drawings, the natural choice seems to be one of the widespread CAD formats (like Spice, Synopsis, and AutoCad).…”
Section: Interoperability (mentioning)
confidence: 99%
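
The layout-independent representation mentioned in the statement above can be illustrated with a minimal sketch. It assumes the usual reading of Wang Notation: a table is a set of hierarchical category trees plus a mapping from one leaf path per category to a content cell. The category names, the cell values, and the helpers `leaf_paths`, `all_keys`, and `cell` are hypothetical and chosen only for illustration; they are not taken from the cited papers.

```python
from itertools import product

# Hypothetical category trees: nested dicts, where an empty dict marks a leaf.
CATEGORIES = {
    "Year": {"2005": {}, "2006": {}},
    "Term": {"Fall": {"Midterm": {}, "Final": {}}, "Spring": {"Final": {}}},
}


def leaf_paths(name, tree):
    """Return every root-to-leaf path of one category tree as a tuple of labels."""
    paths = []

    def walk(prefix, subtree):
        if not subtree:  # reached a leaf header
            paths.append(prefix)
            return
        for label, child in subtree.items():
            walk(prefix + (label,), child)

    walk((name,), tree)
    return paths


def all_keys(categories):
    """Enumerate every combination of one leaf path per category."""
    per_category = [leaf_paths(name, tree) for name, tree in categories.items()]
    return [frozenset(combo) for combo in product(*per_category)]


# Hypothetical content cells. Each key is an unordered set of header paths,
# so nothing in the mapping records rows, columns, or reading order.
DELTA = {
    frozenset({("Year", "2005"), ("Term", "Fall", "Midterm")}): 81,
    frozenset({("Year", "2005"), ("Term", "Fall", "Final")}): 90,
    frozenset({("Year", "2005"), ("Term", "Spring", "Final")}): 77,
    frozenset({("Year", "2006"), ("Term", "Fall", "Midterm")}): 85,
    frozenset({("Year", "2006"), ("Term", "Fall", "Final")}): 88,
    frozenset({("Year", "2006"), ("Term", "Spring", "Final")}): 79,
}


def cell(delta, *paths):
    """Look up a content cell by its header paths, given in any order."""
    return delta[frozenset(paths)]


print(cell(DELTA, ("Term", "Fall", "Final"), ("Year", "2006")))
```

Because the keys are unordered sets of header paths, the same lookup works whether the years are laid out as rows or as columns. That property is what would make such a structure a convenient stepping stone toward a database query.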