It is very difficult to analyze form structures because of breaks in lines and additional noise on the form image. This paper focuses on cell recognition in low-quality form images. The recognition method has two features that make cell recognition robust. One is a grid representation using several types of intersections and the terminal points of the frame lines. The other is recursive modification of the representation. A new representation is created from the previous one by determining the breaks in lines and hypothesizing the locations of missed intersections. The modification is applied recursively until the representation is fully consistent and all form cells are detected. In an experiment using 1565 form samples, all cells in 1538 samples (98.3% of 1565 samples) were recognized correctly by this method.
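To make the idea of a grid representation built from intersection types and terminal points more concrete, the following is a minimal sketch, not the paper's actual representation. It assumes each grid node is described by which of four line arms (up, down, left, right) are visible around it; the function name `classify_node` and the type labels are hypothetical and chosen only for illustration.

```python
# Hypothetical labelling of a grid node by the line arms that leave it.
# The arm names and the returned labels are assumptions for illustration;
# the abstract does not list the intersection types it uses.
ARMS = frozenset({"up", "down", "left", "right"})


def classify_node(arms):
    """Return a coarse label for a node given its visible arms."""
    arms = frozenset(arms)
    unknown = arms - ARMS
    if unknown:
        raise ValueError(f"unknown arm names: {sorted(unknown)}")

    count = len(arms)
    if count == 4:
        return "cross"      # four-way intersection
    if count == 3:
        return "tee"        # T-shaped intersection
    if count == 2:
        # Two collinear arms are just a point on a line, not a junction.
        if arms in (frozenset({"up", "down"}), frozenset({"left", "right"})):
            return "straight"
        return "corner"     # L-shaped intersection
    if count == 1:
        return "terminal"   # end point of a frame line
    return "empty"          # no line passes through this node
```

Under this assumed labelling, a "terminal" node in the interior of a table could be one hint that a line is broken there, which is the kind of inconsistency the recursive modification is described as resolving.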
Form document structure analysis is an essential technique for recognizing the positions of characters in general forms. A fundamental problem, however, is that interruptions in lines, as well as noise, lead to incorrect analysis. This paper focuses on a method for connecting junction patterns in which portions of the horizontal and vertical lines are not visible, referred to as "disappeared junction patterns." Our method has two key stages for making correct connections. The first is noise elimination, in which lines that are shorter than the minimum line length parameter and whose two end points meet no other lines are eliminated. The second is object line selection, in which only the frame lines of tables are selected as object lines for connection. Experiments with 39 form images demonstrated the feasibility of this method.
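The noise elimination rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes line segments have already been extracted from the image as endpoint pairs, and it treats an end point as "meeting" another line when it lies within a pixel tolerance of that line. The names `Segment`, `eliminate_noise`, `min_length`, and `tol` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Segment:
    """A detected line segment given by its two end points (assumed input)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def length(self):
        return hypot(self.x2 - self.x1, self.y2 - self.y1)

    @property
    def endpoints(self):
        return ((self.x1, self.y1), (self.x2, self.y2))


def point_to_segment_distance(px, py, seg):
    """Shortest distance from point (px, py) to the segment seg."""
    dx, dy = seg.x2 - seg.x1, seg.y2 - seg.y1
    denom = dx * dx + dy * dy
    if denom == 0:
        # Degenerate segment: both end points coincide.
        return hypot(px - seg.x1, py - seg.y1)
    # Project the point onto the segment and clamp to its extent.
    t = ((px - seg.x1) * dx + (py - seg.y1) * dy) / denom
    t = max(0.0, min(1.0, t))
    return hypot(px - (seg.x1 + t * dx), py - (seg.y1 + t * dy))


def endpoint_meets_other(point, seg, segments, tol):
    """True if the point lies within tol of any segment other than seg."""
    return any(
        other is not seg and point_to_segment_distance(*point, other) <= tol
        for other in segments
    )


def eliminate_noise(segments, min_length, tol=2.0):
    """Drop segments that are short AND have no end point meeting another line.

    tol is an assumed pixel tolerance; the abstract does not specify one.
    """
    kept = []
    for seg in segments:
        isolated = not any(
            endpoint_meets_other(p, seg, segments, tol) for p in seg.endpoints
        )
        if isolated and seg.length < min_length:
            continue  # treated as noise
        kept.append(seg)
    return kept
```

In this sketch both conditions must hold for a segment to be removed, so a short stub attached to a frame line and a long isolated line are both kept. The check runs once against the original segment list; whether the original method repeats the pass after removals is not stated in the abstract.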