2017
DOI: 10.1016/j.patcog.2017.01.032

Improving patch-based scene text script identification with ensembles of conjoined networks

Abstract: This paper focuses on the problem of script identification in scene text images. Facing this problem with state-of-the-art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a

Cited by 75 publications (35 citation statements)
References 59 publications
“…-We experimentally demonstrate that script identification is not required to recognize multi-language text. Unlike competing methods [12,33], E2E-MLT performs script identification from the OCR output using a simple majority voting mechanism over the predicted characters.…”
Section: Introduction (mentioning)
confidence: 99%
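
The majority-voting idea in the statement above can be illustrated with a minimal sketch. It is not the E2E-MLT implementation: the helper names `char_script` and `vote_script` are hypothetical, and taking the first word of a character's Unicode name as its script label is an assumption made only for this example.

```python
import unicodedata
from collections import Counter


def char_script(ch):
    """Rough script label for one character (hypothetical helper).

    Assumption: the first word of the Unicode character name
    (e.g. 'LATIN', 'CYRILLIC', 'ARABIC') is a usable script label.
    """
    if not ch.isalpha():
        return None  # digits, spaces and punctuation do not vote
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None


def vote_script(text):
    """Return the most frequent script label among the characters of `text`."""
    votes = Counter(s for s in map(char_script, text) if s is not None)
    return votes.most_common(1)[0][0] if votes else None
```

For instance, `vote_script` applied to a recognized word that mixes five Latin letters with three Cyrillic ones would return the Latin label, since only the character-level majority matters.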
“…The MLP has exhibited a maximum accuracy of 90% among them and time complexity is significantly high. Lluis Gomez et al [10] have presented the scene text script identification by the improved patch-based method.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…[12], and MLe2e. [8]. The CVSI-2015 dataset contains 10 scripts namely; English, Hindi, Bengali, Oriya, Gujarathi, Punjabi, Kannada, Tamil, Telugu, Arab and the dataset size of 10,665; SIW-13 has the 13 scripts; Tibetan, Thai, Russian, Mongolian, Korean, Kannada, Japanese, Hebrew, Greek, English, Chinese, Cambodian, and Arabic.…”
Section: Data Collection (mentioning)
confidence: 99%
“…There were multistage training process and great computation due to clustering. Inspired by Siamese network [14], Gomez et al [15] proposed an improved patch-based method containing an ensemble of identical nets to learn discriminative strokepart representations. Mei et al [16] adopted Convolutional Recurrent Neural Networks [17] to extract the image representation and spatial dependency which is discriminative in spite of sharing characters.…”
Section: Introduction (mentioning)
confidence: 99%
“…As for the problem of arbitrary aspect ratios, recent methods with good performance take densely cropped image patches with fixed size as input [12], [13], [15], [20]. They also employ data augmentation somehow, but they suffered from the following three issues.…”
Section: Introduction (mentioning)
confidence: 99%
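
The dense fixed-size cropping mentioned in the last statement can be sketched as follows. This is only an illustration under stated assumptions: the function name `patch_offsets`, the 32-pixel patch size, and the 8-pixel stride are hypothetical choices, not values taken from the cited papers.

```python
def patch_offsets(width, height, patch=32, stride=8):
    """Left x-coordinates of square patches cropped from a text-line image.

    Assumptions (for illustration only): the image is first rescaled so
    its height equals `patch` while keeping the aspect ratio, then square
    patch x patch windows are slid horizontally with step `stride`.
    """
    # Width after rescaling; very narrow images are widened to one patch.
    scaled_w = max(patch, round(width * patch / height))
    return list(range(0, scaled_w - patch + 1, stride))
```

Under these assumptions a 200x50 word image is rescaled to 128x32 and yields 13 overlapping patches, while a very narrow image yields a single patch, so the number of inputs grows with the aspect ratio instead of the image being distorted to a fixed size.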