2018
DOI: 10.29007/1zvp

"How good is good enough?" Establishing quality thresholds for the automatic text analysis of retro-digitized comics

Abstract: Stylometry in the form of simple statistical text analysis has proven to be a powerful tool for text classification, e.g. in the form of authorship attribution. When analyzing retro-digitized comics, manga and graphic novels, the researcher is confronted with the problem that automated text recognition (ATR) still leads to results that have comparatively high error rates, while the manual transcription of texts remains highly time-consuming. In this paper, we present an approach and measures that specify wheth…

Cited by 3 publications (1 citation statement)
References: 9 publications
“…Training simple CNNs on relatively small data sets significantly increases recognition, with Dey et al () achieving an accuracy rate of 97% on hand‐drawn symbols. A different approach is taken in Hartel and Dunst (). Using Tesseract 4, the authors show that even imperfectly recognized comics texts can be used for simple text analysis based on a bag‐of‐words model and yield similar results to manually‐transcribed data.…”
Section: Analysis Of Text and Narrative Structure
Citation type: mentioning
confidence: 99%
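
The citation statement above refers to a bag-of-words comparison between automatically recognized and manually transcribed text. The following is only a minimal sketch of that general idea, not the method used by Hartel and Dunst: it assumes a simple regex tokenizer and cosine similarity as the comparison measure, and the function names `bag_of_words` and `cosine_similarity` as well as both example strings are hypothetical.

```python
from collections import Counter
import math
import re


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count tokens (runs of letters, digits, apostrophes)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Invented example strings (assumption), not data from the paper:
# the second one imitates typical recognition errors ("tbe", "her0").
manual = "the hero runs to the city and the hero wins"
ocr = "the hero runs to tbe city and the her0 wins"

similarity = cosine_similarity(bag_of_words(manual), bag_of_words(ocr))
print(f"bag-of-words cosine similarity: {similarity:.3f}")
```

A value close to 1 would suggest that the recognition errors barely change the word-frequency profile; what counts as "close enough" for a given analysis is the question the paper addresses and is not settled by this sketch.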