2019
DOI: 10.1016/j.patcog.2019.05.031

Text baseline detection, a single page trained system

Abstract: Nowadays, there are a lot of page images available and the scanning process is quite well resolved and can be done industrially. On the other hand, HTR systems can only deal with single text line images. Segmenting pages into single text line images is a very expensive process which has traditionally been done manually. This is a bottleneck which is holding back any massive industrial document processing. A baseline detection method will be presented here. The initial problem is reformulated as a clustering …

Cited by 10 publications (5 citation statements)
References 34 publications
“…Algorithms in the second category try to handle more complex layouts, e.g., arbitrary oriented text lines, and they can be sub-categorized as clustering-based methods (e.g., [3], [16]-[18]) and function analysis methods (e.g., [22]-[27]). Clustering-based methods first extract basic elements (interest points, connected components, etc.)…”
Section: Related Work a Text Baseline Detection In Historical Documents
mentioning
confidence: 99%
“…Grüning et al. [18] first extracted super pixels with the FAST algorithm [48], and then applied a standard clustering method based on some text line characteristics, e.g., curvilinearity, inter-line spacing and local homogeneity. Pastor [3] first filtered noisy local minima points with Extremely Randomized Trees (ERT) and then performed a modified DBScan [49] algorithm. Function analysis methods try to segment text lines by finding an optimal path across the document image.…”
Section: Related Work a Text Baseline Detection In Historical Documents
mentioning
confidence: 99%
“…One of these tasks is baseline detection which is one of the most critical tasks in text preprocessing stage. This process is critical because its results will affect the following tasks such as skew and slant correction, segmentation and features extraction processes [17][18][19][20][21]. The baseline is a pseudo horizontal line which connects the ascender strokes and descender strokes.…”
Section: Introduction
mentioning
confidence: 99%

Novel Algorithm for Baseline Detection of Offline Arabic Handwritten Text Recognition

Ahmad Mustafa Ali Al Masri, Muhammad Suzuri Hitam, Wan Nural Jawahir Hj Wan Yussof et al. 2024
ARASET
“…Although DLA is a very broad and complex field, most works on DLA focus only on automatic page segmentation and classification, either at text-line level (e.g., baseline detection) [7,15] or including region level segmentation [1,18]. For these subproblems, satisfactory results are currently achieved in most cases.…”
Section: Introduction
mentioning
confidence: 99%
“…Modern techniques to obtain these elements from raw handwritten text images can yield very accurate results and are fairly robust to image degradation and other typical difficulties of handwritten documents [1,7,15,18].…”
mentioning
confidence: 99%