2019
DOI: 10.5167/uzh-177164

Improving OCR of Black Letter in Historical Newspapers: The Unreasonable Effectiveness of HTR Models on Low-Resolution Images

Cited by 3 publications
(1 citation statement)
“…Despite being listed in line here, it should be noted that the above methods cannot be measured reliably against each other, as they are diverse in architecture and function (i.e., Kraken and Tesseract are OCR engines, while eScriptorium and Transkribus are interface platforms for HTR). Regardless, already conducted experiments [Ströbel and Clematide, 2019;Ströbel, Clematide and Volk, 2020;Clérice, 2022b] have demonstrated that from the HTR mentioned above tools, Transkribus and e-Scriptorium (which implements Kraken) are the most successful in producing low CER (Character Error Rate) text recognitions. Part of this success is due to the different and more efficient layout analysis performed by both Transkribus and eScriptorium, an analysis that does not restrict segmentation in rectangular regions, as handwritten text can expand in many forms and directions [Stokes et al, 2021].…”
Section: Literature Review
Citation type: mentioning
confidence: 99%