Table spotting and structural analysis are only a small fraction of the tasks involved in table analysis. Today, a large number of approaches addressing these tasks have been described in the literature or are available as part of commercial OCR systems that claim to detect tables in scanned documents and to treat them accordingly. However, the problem of detecting tables is far from solved. Different approaches have different strengths and weaknesses: some fail in situations or layouts where others perform well. How is one to know which approach or system is best suited for a specific job? This question raises the demand for an objective comparison of the different approaches that address the same task of spotting tables and recognizing their structure. This paper describes our approach towards establishing a complete, publicly available, and hence open environment for benchmarking table spotting and structural analysis. We provide free access to the ground-truthing tool and evaluation mechanism described in this paper, explain the ideas behind them, and also provide ground truth for the 547 documents of the UNLV and UW-3 datasets that contain tables. In addition, we applied the quality measures to results generated by the T-Recs system, which we developed some years ago and whose development we have recently resumed.
Wearable eye trackers open up a large number of opportunities to cater for the information needs of users in today's dynamic society. Users no longer have to sit in front of a traditional desk-mounted eye tracker to benefit from its direct feedback about their interest. Instead, eye tracking can be used as a ubiquitous interface in a real-world environment to provide users with the supporting information they need. This paper presents a novel application of intelligent interaction with the environment that combines eye tracking technology with real-time object recognition. In this context we present i) algorithms for guiding object recognition using fixation points, ii) algorithms for generating evidence of users' gaze on particular objects, and iii) a next-generation museum guide, Museum Guide 2.0, as a prototype application of gaze-based information provision in a real-world environment. We performed several experiments to evaluate our gaze-based object recognition methods. Furthermore, we conducted a user study in the context of Museum Guide 2.0 to evaluate the usability of the new gaze-based interface for information provision. The results show the large potential of a wearable eye tracker as a human-environment interface.
This paper presents a new approach to table structure recognition as well as to layout analysis. The recognition process discussed here differs significantly from existing approaches in that it performs a bottom-up clustering of given word segments, whereas conventional table structure recognizers all rely on the detection of separators such as delineation or significant white space to analyze a page top-down. The subsequent analysis of the recognized layout elements is based on the construction of a tile structure and detects row- and/or column-spanning cells as well as sparse tables with a high degree of confidence. The overall system is completely domain independent, can optionally neglect textual contents, and can thus be applied to arbitrary mixed-mode documents (with or without tables) of any language; it even operates on low-quality OCR documents (e.g. facsimiles).

S.-W. Lee and Y. Nakano (Eds.): DAS'98, LNCS 1655, pp. 255-270, 1999. © Springer-Verlag Berlin Heidelberg 1999. Thomas Kieninger and Andreas Dengel

Words are our elementary objects. Their bounding-box geometry and, optionally, their textual contents constitute the input of the system. Formally, a word is described as a triple W = (T, G, A), where T keeps the textual contents, G denotes the bounding-box geometry, specified by the quadruple G = (x0, y0, x1, y1), and A holds the recognized font attributes.

Lines (also referred to as text lines) are an initially built aggregation of words that serves as an auxiliary structure for all following procedures. They are described as quadruples L = (W0, succ, G, A), specifying a sorted list of words (with W0 naming the first word and succ the appropriate successor function), the bounding box G, and the attributes A = (linenumber, rownumber, spc_len), which hold the unique line number, the logical row number (see Sect. 3.5), and the average space width.

Blocks are dynamic aggregations of words.
Note that blocks are not aggregations of lines! The initial word-to-block mapping is made by the central clustering algorithm but is changed by the various postprocessing steps. Blocks thus keep the main segmentation information. Like lines, blocks are described as quadruples B = (W0, succ, G, A). The additional block attributes A = (type, justification, height, nmbwords) describe the classification into types 1 and 2 (see Sect. 3.2), the justification, the height (as number of lines), and the number of words in the block.

The document is defined as a quadruple D = (B0, succ, G, A), where B0 and succ specify a sorted list of blocks. The successor function succ_doc(s) is defined for words, lines, and blocks. The attributes A = (l_spc) contain the average line spacing. Significantly large distances between adjacent lines cause the construction of dummy lines between them; this prevents the succ_doc(line) function from bridging large line spacings such as those between paragraphs.

In our notation, objects of higher-level instances are denoted in a functional way: block(w_x) stands for the associated block of...
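The word, line, block, and document tuples described above can be sketched as Python dataclasses. This is a minimal illustration of the data model only; the class and field names (BBox, Word, Line, Block, Document, line_from_words) are our own assumptions, not identifiers from the original system, and the sorted lists stand in for the W0/succ successor-function notation:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class BBox:
    """Bounding-box geometry G = (x0, y0, x1, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

def union(a: BBox, b: BBox) -> BBox:
    """Smallest box enclosing both a and b."""
    return BBox(min(a.x0, b.x0), min(a.y0, b.y0),
                max(a.x1, b.x1), max(a.y1, b.y1))

@dataclass
class Word:
    """W = (T, G, A): textual contents (optional), geometry, font attributes."""
    text: str
    geom: BBox
    attrs: dict = field(default_factory=dict)

@dataclass
class Line:
    """L = (W0, succ, G, A): the sorted word list replaces W0 and succ."""
    words: List[Word]
    geom: BBox
    linenumber: int
    rownumber: int = 0       # logical row number, assigned later (Sect. 3.5)
    spc_len: float = 0.0     # average space width

@dataclass
class Block:
    """B = (W0, succ, G, A): a dynamic aggregation of words (not of lines)."""
    words: List[Word]
    geom: BBox
    type: int = 1            # classification into type 1 or 2 (Sect. 3.2)
    justification: str = ""
    height: int = 0          # height as number of lines
    nmb_words: int = 0

@dataclass
class Document:
    """D = (B0, succ, G, A): sorted block list plus average line spacing."""
    blocks: List[Block]
    geom: BBox
    l_spc: float = 0.0

def line_from_words(words: List[Word], linenumber: int) -> Line:
    """Build a line whose bounding box encloses all of its words."""
    geom = words[0].geom
    for w in words[1:]:
        geom = union(geom, w.geom)
    return Line(words=words, geom=geom, linenumber=linenumber)
```

For example, aggregating two adjacent words into a line yields a line whose bounding box spans both word boxes, mirroring how the clustering steps derive the geometry of higher-level objects from their constituents.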