On Combining Multiple Segmentations in Scene Text Recognition

Neumann, Lukáš; Matas, Jiří

doi:10.1109/icdar.2013.110

Cited by 86 publications

(31 citation statements)

References 14 publications


“…Although many approaches (e.g., [1], [2], [3]) have been proposed, this problem remains largely unsolved, e.g., the winning team in ICDAR-2013 "Reading Text in Scene Images" competition achieved only a localization recall of about 66% [4]. The difficulties mainly come from diversities of texts (e.g., languages, font, size, color, orientation, noise, illumination, low contrast, occlusion and so on) as well as the complexity of the backgrounds [5].…”

Section: Introduction (mentioning)

confidence: 99%

“…Existing text detection methods can be categorized into three groups: sliding window based methods (e.g., [6], [7], [8]), connected component (CC) based methods (e.g., [1], [2], [5], [9]) and hybrid methods (e.g., [3], [10]). Among them, the extremal-region (ER) based methods, which belong to the connected component based methods, won the first places in both ICDAR-2011 and ICDAR-2013 competitions ([11], [4]).…”

Section: Introduction (mentioning)

confidence: 99%

“…2(c). Existing solutions (e.g., [1], [2], [9]) mainly include or combine the following three steps: 1) Preprune non-text components with handcrafted features and classifiers (e.g., random forests, SVM); 2) Group remaining components into candidate text-lines; 3) Verify each candidate line. Although text-line information can reduce ambiguity and improve classification accuracy indeed, candidate text-line grouping itself is an open problem, especially when image layout or background is complex.…”

Section: Introduction (mentioning)

confidence: 99%


Robust Text Detection in Natural Scene Images by Generalized Color-Enhanced Contrasting Extremal Region and Neural Networks

Sun, Huo, Jia, et al. 2014

2014 22nd International Conference on Pattern Recognition


This paper presents a robust text detection approach based on generalized color-enhanced contrasting extremal region (CER) and neural networks. Given a color natural scene image, six component-trees are built from its grayscale image, hue and saturation channel images in a perception-based illumination invariant color space, and their inverted images, respectively. From each component-tree, generalized color-enhanced CERs are extracted as character candidates. By using a "divide-and-conquer" strategy, each candidate image patch is labeled reliably by rules as one of five types, namely, Long, Thin, Fill, Square-large and Square-small, and classified as text or non-text by a corresponding neural network, which is trained by an ambiguity-free learning strategy. After pruning non-text components, repeating components in each component-tree are pruned by using color and area information to obtain a component graph, from which candidate text-lines are formed and verified by another set of neural networks. Finally, results from six component-trees are combined, and a post-processing step is used to recover lost characters and split text lines into words as appropriate. Our proposed method achieves 85.72% recall, 87.03% precision, and 86.37% F-score on ICDAR-2013 "Reading Text in Scene Images" test set.
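As a quick consistency check on the figures in the abstract above, the F-score can be recomputed from the stated recall and precision. This is a minimal sketch that assumes the usual harmonic-mean (F1) definition, which the abstract does not spell out; the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: 87.03% precision, 85.72% recall.
recomputed = f_score(87.03, 85.72)
print(round(recomputed, 2))  # prints 86.37, matching the reported F-score
```

Under that assumption, the three reported numbers are mutually consistent.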

“…In end-to-end method [2] individual characters were detected as Extremal Regions. The regions were first agglomerated into text lines by an efficient pruned exhaustive search that estimates the text direction on each triplet of regions and the constraints induced by the text direction contribute to the similarity measure used for clustering.…”

Section: Existing Text Recognition Methods (mentioning)

confidence: 99%

Text Recognition from Complex Colored Images using Neural Network with Discriminative Feature Extraction

Devi, Sathyanarayanan, Sumathi 2017

IJCA


The objective of this paper is to present a new methodology for text recognition from the features of segmented text components of images. The text classification algorithm is the main decision-making stage of a text recognition system. An artificial neural network approach has been used to train and test the characters based on the extracted features. Finally, the identified texts are converted into a readable/editable text file.

Keywords: Text extraction, Feature extraction, Text Recognition, Neural Network, Back Propagation.

INTRODUCTION

The aim of text recognition is to recognize and convert human-readable text image characters to machine-readable characters. The classification stage is the main decision-making stage of a text recognition system and uses the features extracted in the previous stage to identify the text component.

EXISTING TEXT RECOGNITION METHODS

The text recognition stage is the main decision-making stage of a text recognition system. Various classifier techniques have been proposed in the literature and are used for the recognition of text. Some of them are the multi-level slice classifier, minimum distance classifier, maximum likelihood classifier, fuzzy measure, artificial neural network, support vector machines, decision tree, etc.

A method [1] uses convolutional co-occurrence histogram of oriented gradients (ConvCoHOG) features, which are more robust and discriminative than both the histogram of oriented gradients (HOG) and the co-occurrence histogram of oriented gradients (CoHOG). An image was first divided into smaller patches, and a feature extraction procedure was applied to every patch separately. The orientation of the gradient of each pixel within a patch was then quantized into histogram bins, and the normalized histograms were concatenated to form a feature vector, which was used to train a linear SVM classifier.

In the end-to-end method [2], individual characters were detected as Extremal Regions. The regions were first agglomerated into text lines by an efficient pruned exhaustive search that estimates the text direction on each triplet of regions; the constraints induced by the text direction contribute to the similarity measure used for clustering. In the next stage, each region in the text line was labeled by the character recognition module, which was trained on synthetic fonts. Regions with low confidence were rejected, which eliminates clutter regions that were included in the text line formation stage. In the last step, a directed graph was constructed with corresponding scores assigned to each node and edge; the scores were normalized by the width of the area that they represent, and a standard dynamic programming algorithm was used to select the path with the highest score. The sequence of regions and their labels induced by the optimal path was the output of the method.

Gokhan Yildirim et al. [3] proposed a technique to detect and recognize text in a unified manner by searching for words directly without reducing the image into text regions or individual charact...
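The last step described for the end-to-end method [2], selecting the highest-scoring path through a directed graph with node and edge scores, can be illustrated with a small dynamic programming sketch. This is not the authors' implementation. It assumes nodes are indexed left to right so that every edge goes from a lower to a higher index, that a path may start and end at any node, and that scores are already normalized. The function `best_path` and the toy scores are hypothetical.

```python
def best_path(node_scores, edges):
    """Return (score, path) of the highest-scoring path in a DAG.

    node_scores: list of floats; node i is assumed to precede node j when i < j.
    edges: dict mapping (u, v) with u < v to an edge score.
    """
    n = len(node_scores)
    best = list(node_scores)  # best[i]: top score of any path ending at node i
    prev = [None] * n         # back-pointers for path recovery

    # Visiting edges by increasing target index guarantees best[u] is final
    # before it is used, because every edge into u has a smaller target.
    for (u, v), w in sorted(edges.items(), key=lambda item: item[0][1]):
        candidate = best[u] + w + node_scores[v]
        if candidate > best[v]:
            best[v] = candidate
            prev[v] = u

    end = max(range(n), key=lambda i: best[i])
    score = best[end]
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    return score, path[::-1]


# Toy example: four candidate regions with made-up scores.
scores = [1.0, 0.25, 0.75, 0.5]
links = {(0, 1): 0.25, (0, 2): 0.5, (1, 3): 0.5, (2, 3): 0.25}
print(best_path(scores, links))  # prints (3.0, [0, 2, 3])
```

In the toy graph, the path through node 2 wins over the path through node 1 because its combined node and edge scores are higher. The width normalization mentioned in the text is left out of this sketch.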


“…Maximally Stable Extremal Regions (MSERs) [22] have been used widely for scene text detection [25,11,29] and segmentation [25,36,26]. A classifier is trained to separate text from background based on the shape of each MSER region, along with other hand-crafted features.…”

Section: Related Work (mentioning)

confidence: 99%

Robust and Accurate Text Stroke Segmentation

Qin, Peng, Kim, et al. 2018

2018 IEEE Winter Conference on Applications of Computer Vision (WACV)


We propose a new technique for the accurate segmentation of text strokes from an image. The algorithm takes in a cropped image containing a word. It first performs a coarse segmentation using a Fully Convolutional Network (FCN). While not accurate, this initial segmentation can usually identify most of the text stroke content even in difficult situations, with uneven lighting and non-uniform background. The segmentation is then refined using a fully connected Conditional Random Field (CRF) with a novel kernel definition that includes stroke width information. In order to train the network, we created a new synthetic data set with 100K text images. Tested against standard benchmarks with pixel-level annotation (ICDAR 2003, ICDAR 2011, and SVT) our algorithm outperforms the state of the art by a noticeable margin.
