1982
DOI: 10.1016/0146-664x(82)90059-4

Block segmentation and text extraction in mixed text/image documents

Cited by 341 publications
(44 citation statements)
References 2 publications
“…The use of a novel Adaptive Run Length Smoothing Algorithm (ARLSA), which is a modified version of the state-of-the-art RLSA [5] and efficiently groups homogeneous document regions. The definition of background obstacles that ARLSA is not allowed to cross in order to avoid merging neighboring text columns or text lines.…”
Section: Proposed Methodology (mentioning)
confidence: 99%
“…These techniques can be categorized based on the document image segmentation algorithm that they adopt. The most known of these segmentation algorithms are the following: X-Y cuts or projection profiles based [4], Run Length Smoothing Algorithm (RLSA) [5], component grouping [6], document spectrum [7], whitespace analysis [8], constrained text lines [9], Hough transform [10,11], Voronoi tessellation [12] and Scale space analysis [13]. All of the above segmentation algorithms are mainly designed for contemporary documents.…”
Section: Related Work (mentioning)
confidence: 99%
“…It then segregates the image from the large white rectangles. The run-length smearing method [21], on the other hand, seeks to enter at the bottom of the hierarchy. It goes along each scan line and blackens the small spaces between black pixels.…”
Section: Document Layout Analysis (mentioning)
confidence: 99%
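The smearing step described in the statement above (go along each scan line and blacken the small spaces between black pixels) can be illustrated with a minimal sketch. The names `smear_row`, `smear_image`, and `threshold` are hypothetical, and the conventions used here (1 = black, 0 = white, a gap is filled when its length is at most `threshold`, white runs touching the row edges are left alone) are assumptions made for illustration, not details taken from the cited paper.

```python
from typing import List


def smear_row(row: List[int], threshold: int) -> List[int]:
    """Fill short white gaps between black pixels on one scan line.

    Assumed convention: 1 = black, 0 = white. A run of white pixels is
    blackened only if it lies between two black pixels and its length
    is at most `threshold`.
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1
                if 0 < gap <= threshold:
                    # Blacken the short white run between the two black pixels.
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def smear_image(image: List[List[int]], threshold: int) -> List[List[int]]:
    """Apply horizontal smearing independently to every scan line."""
    return [smear_row(row, threshold) for row in image]


if __name__ == "__main__":
    line = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
    print(smear_row(line, threshold=2))
```

With a threshold of 2, the two-pixel gap in the example line is filled while the four-pixel gap is kept, so closely spaced marks merge into one run and more widely separated ones stay apart.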
“…There are two classes of document segmentation methods. The first class uses bottom-up techniques, including O'Gorman's Docstrum algorithm [8], the Voronoi diagram based algorithm of Kise et al [9], the run-length smearing algorithm of Wahl et al [10], the segmentation algorithm of Jain and Yu [11], the text string separation algorithm of Fletcher and Kasturi [12], the 'white tiles' method of Antonacopoulos [13] and Mitchell and Yan's pattern spread and soft ordering methods [14,15]. The second class are top-down techniques, including the X-Y-cut-based algorithm of Nagy et al [16], and the shape-directed-covers based algorithm of Baird et al [17].…”
Section: Introduction (mentioning)
confidence: 99%