Cursive Character Recognition in Natural Scene Images Using a Multilevel Convolutional Neural Network Fusion

Chandio, Asghar Ali; Asikuzzaman, Md.; Pickering, Mark

doi:10.1109/access.2020.3001605

Cited by 28 publications

(19 citation statements)

References 77 publications

“…In order to verify the effectiveness of each component of the proposed method, we propose two variants based on the ETM method: single-ETM and TM. The single-ETM method uses a single-objective optimization framework, which only considers the accuracy performance of the model to be optimized. The other settings are consistent with the ETM method.…”

Section: Comparison Methods (mentioning)

confidence: 99%

ETM: Effective Tuning Method Based on Multi-Objective and Knowledge Transfer in Image Recognition

Liu, Zhao. 2021. IEEE Access.

With the widespread application of machine learning and deep learning, image recognition has developed continuously. However, using machine learning and deep learning in practice remains challenging: tuning an algorithm is both critical to its performance and difficult to do well. Although many previous works improve the final accuracy of recognition algorithms through tuning, they do not consider other indicators that also matter in real environments, such as latency and central processing unit (CPU) utilization. In this paper, we propose an effective tuning method based on multi-objective optimization and knowledge transfer, which addresses these limitations in image recognition. Specifically, we first use an agent to automatically tune the recognition algorithms, and combine the prediction accuracy and the running latency of each episode into a multi-objective reward signal that guides the update of the agent's internal parameters. In this way, the agent can continuously select better algorithm configurations to improve prediction performance. In addition, we improve the efficiency of this tuning process by transferring knowledge: we learn meta parameters from other small-scale tasks and use them to initialize the agent. In the experiments, we apply the proposed method to tune eXtreme Gradient Boosting and random forest on 57 image recognition tasks and a convolutional neural network on 2 tasks. The experimental results verify that the proposed method achieves average accuracy rankings of 1.92, 1.42 and 1.71 on the three algorithms to be optimized, respectively. In terms of latency in particular, the proposed method performs best on all the tasks (57 data sets) on the three algorithms to be optimized. In addition, we verify the various components of the proposed method through ablation experiments.
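The abstract above describes combining prediction accuracy and running latency into a single reward signal but does not give the formula. The following is a minimal sketch of one possible combination, a weighted sum of accuracy and a normalized latency score. The function name `multi_objective_reward`, the `weight` and `latency_budget_s` parameters, and the linear form are all assumptions made for illustration, not the ETM authors' definition.

```python
def multi_objective_reward(accuracy, latency_s, latency_budget_s=1.0, weight=0.5):
    """Fold accuracy and latency into one scalar reward.

    Hypothetical formulation: a weighted sum of accuracy and a latency
    score. The ETM abstract does not specify how the two are combined.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if latency_s < 0 or latency_budget_s <= 0:
        raise ValueError("latency must be >= 0 and the budget must be > 0")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    # Latency score: 1.0 for an instantaneous run, falling linearly to
    # 0.0 once the run takes as long as (or longer than) the budget.
    latency_score = max(0.0, 1.0 - latency_s / latency_budget_s)

    # weight = 1.0 recovers a single-objective, accuracy-only reward.
    return weight * accuracy + (1.0 - weight) * latency_score


# Two hypothetical configurations: one slightly more accurate but slow,
# one slightly less accurate but fast.
slow_config = multi_objective_reward(accuracy=0.92, latency_s=0.9)
fast_config = multi_objective_reward(accuracy=0.90, latency_s=0.2)
```

Under these assumed settings the faster configuration receives the higher reward, which is the kind of trade-off an accuracy-only reward cannot express. Setting `weight=1.0` reduces the sketch to an accuracy-only objective.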

“…So far, machine learning and deep learning has made great progress in many works on the image recognition field [1]-[3]. However, machine learning and deep learning still need many tedious processes in practical applications.…”

Section: Introduction (mentioning)

confidence: 99%

ETM: Effective Tuning Method Based on Multi-Objective and Knowledge Transfer in Image Recognition

Liu, Zhao. 2021. IEEE Access.

“…They have been adapted for feature extraction in many text recognition systems. We can cite for example: scene text recognition [12], [13], video text recognition [14], and offline handwriting text recognition [15]-[17]. However, CNN-based or DL-based approaches are still deficient.…”

Section: Feature Extraction (mentioning)

confidence: 99%

Deep Sparse Auto-Encoder Features Learning for Arabic Text Recognition

et al. 2021

One of the most challenging recent problems in pattern recognition and artificial intelligence is Arabic text recognition. The topic remains an open research field for several reasons: the cursive nature of Arabic writing, character similarities, unlimited vocabulary, the use of multiple sizes and mixed fonts, etc. To handle these challenges, automatic Arabic text recognition requires a robust system that combines discriminative features with a rigorous classifier to achieve improved performance. In this work, we introduce a new deep learning based system that recognizes Arabic text contained in images. We propose a novel hybrid network, combining a Bag-of-Feature (BoF) framework for feature extraction based on a deep Sparse Auto-Encoder (SAE), and Hidden Markov Models (HMMs), for sequence recognition. Our proposed system, termed BoF-deep SAE-HMM, is tested on four datasets, namely the printed Arabic line images Printed KHATT (P-KHATT), the benchmark printed word images Arabic Printed Text Image (APTI), the benchmark handwritten Arabic word images IFN/ENIT, and the benchmark handwritten digits images Modified National Institute of Standards and Technology (MNIST).

“…The output feature map Z can be regarded as the set of all channel feature maps Z_k. Finally, calculate the activation value V_k of each channel, where Z_k(i, j) is the pixel activation value of the feature map of channel k, as shown in Equation (5).…”

Section: Channel Quantization and Deep Neural Network (mentioning)

confidence: 99%

“…recognition may suffer from complex backgrounds, uneven lighting, low contrast, blurry texts, text directions, font colors, writing styles, and mixed languages. Hence, character recognition in natural scenes has become a major study focus in this field [4]-[6]. In recent years, with the widespread attention to the bag-of-words (BOW) [7], many methods have been proposed to segment a character into small image "words", such as the end of a stroke, a curved stroke, or a cross stroke, and these small words can successfully increase the recognition rate.…”

Section: Introduction (mentioning)

confidence: 99%

ILBPSDNet: Based on improved local binary pattern shallow deep convolutional neural network for character recognition

Lee, Yang. 2021. IET Image Processing.

This paper proposes an architecture based on an improved local binary pattern (LBP) shallow deep convolutional neural network, which combines hand-crafted feature preprocessing with the supervised high-level feature learning of a CNN in order to enhance performance. The study introduces scale-space information into the LBP to reduce its sensitivity to noise, and uses two kinds of feature maps: the maximum selection feature map (MLBP) and the first selection feature map (FLBP). The former selects the edge with the strongest intensity to reduce the influence of noise points, while the latter measures local binary features through the scale detection of an effective edge. In the network architecture design, networks of different depths are used for learning according to the differences between the input features, and the features learned by the two networks are combined for classification. The experimental results show that the proposed ILBPSDNet achieves a reasonable level of recognition on many character data sets while reducing network parameters and computation, which makes it well suited to real-time character recognition. Finally, its performance remains comparable to that of other recent networks.
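For readers unfamiliar with the descriptor that ILBPSDNet builds on, here is a minimal sketch of the classic 3x3 local binary pattern. It is not the improved MLBP or FLBP variant from the abstract, whose details are not given here. The function name `lbp_code` and the clockwise bit ordering starting at the top-left neighbour are assumptions; other orderings are equally valid as long as they are applied consistently.

```python
def lbp_code(patch):
    """Return the basic 8-bit LBP code for the centre pixel of a 3x3 patch.

    Classic LBP only: each neighbour contributes a 1 bit when its
    intensity is greater than or equal to the centre intensity.
    """
    if len(patch) != 3 or any(len(row) != 3 for row in patch):
        raise ValueError("patch must be 3x3")

    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    # (an assumed convention for this sketch).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]

    code = 0
    for bit, (row, col) in enumerate(offsets):
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


# A flat patch sets every bit; an isolated bright centre sets none.
flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak_patch = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
flat_code = lbp_code(flat_patch)
peak_code = lbp_code(peak_patch)
```

Because the code depends only on comparisons with the centre pixel, it is unchanged by uniform brightness shifts. A single noisy neighbour can still flip a bit, which is the sensitivity the abstract's scale-space extension aims to reduce.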
