A Smart Image Processing Algorithm for Text Recognition, Information Extraction and Vocalization for the Visually Challenged

Manoharan, S.

doi:10.36548/jiip.2019.1.004

Cited by 85 publications

(17 citation statements)

References 16 publications


“…NLP module of the general TTS synthesizer consists of the Pre-processor, text analyzer, and contextual analyzer (2). The components are automatic phonetization, text analysis, and prosody generation (3,4). There are several factors which is affected NLP and the final output of digital signal processing work like, quality of the microphone, environmental echo, noise, and sampling frequency.…”

Section: Related Work (mentioning)

confidence: 99%

“…Synthesized speech can be created by concatenating part of recorded speech which is stored in a database. Speech is often based on concatenation of natural speech that is the units, which are taken from natural speech put together to form a word or sentence (3). Text-to-speech synthesis system has a wide range of applications in everyday life and a text-to-speech synthesizer is used for vocalization processed content (4). In the last decade, a great deal of TTS-Synthesis system has done much work in various languages as well as different synthesis techniques such as Unit-selection, Formant, Hidden Markov Model, and Articulatory synthesis was done by researchers (5).…”

Section: Introduction (mentioning)

confidence: 99%


Text to Speech Synthesizer for Tigrigna Linguistic using Concatenative Based approach with LSTM model

Araya, Alehegn

2022

IJST


Objectives: The purpose of this study is to describe a text-to-speech system for the Tigrigna language using a dialog fusion architecture, and to develop a prototype text-to-speech synthesizer for Tigrigna. Methods: Direct observation and a review of articles are applied to identify the full set of strings that represent the language. The tools used in this work are MATLAB, LPC, and Python. An LSTM deep learning model was applied, and its accuracy, precision, recall, and F-score were measured. Findings: The overall word-level performance of the system, evaluated with the NeoSpeech tool, is 78%. At the sentence level, the intelligibility and naturalness of the synthesized speech are measured on the MOS scale and found to be 3.28 and 3.27, respectively. In the experiment, the LSTM deep learning model achieves an accuracy of 91.05%, a precision of 78.05%, a recall of 86.59%, and an F-score of 83.05%. The values of performance, intelligibility, and naturalness are encouraging and show that diphone speech units are good candidates for developing a fully functional speech synthesizer. Novelty: The researchers present the first text-to-speech LSTM deep learning model for the Tigrigna language, which can serve as a baseline for related research on Tigrigna and other languages.


“…The author Manoharan, Samuel et al [9] proposes the utilization of "the image processing techniques in the extracting the texts, recognizing and vocalizing for the people with visual impairments" the method utilizes the "latte panda alpha" for processing scanned images.…”

Section: Related Work (mentioning)

confidence: 99%

March 2020

JAICN (Journal of Artificial Intelligence and Capsule Networks), Vol. 02, No. 01, pp. 1-10

http://irojournals.com/aicn/

DOI: https://doi.org/10.36548/jaicn.2020.1.001


Classifying text, which involves identifying and categorizing it, is a tedious and challenging task. The Capsule Network (Caps-Net) is a distinctive architecture that can capture the basic attributes carrying domain-specific insight, which helps bridge the knowledge gap between source and destination tasks, and it can learn more robust representations than convolutional neural networks (CNNs) in the image classification domain. The paper uses it to classify text. Because multi-task learning shares insights between related tasks and indirectly augments the training data, the paper proposes a Caps-Net based multi-task learning framework. The proposed architecture classifies text effectively and minimizes the interference among tasks in multi-task learning. It is evaluated on several text classification datasets to confirm the efficacy of the proposed framework.


“…It can play a significant role in many aspects such as post-office automation, national ID number recognition, parking lot management system, and online banking (Alom et al, 2018). This recognition system can also play an essential part in signboard translation, digital character conversation, keyword spotting, scene image analysis, text-to-speech conversion (Manoharan, 2019), meaning translation, and most notably in Bangla optical character recognition (OCR) (Manisha, Sreenivasa & Sundara Krishna, 2016). But it has been a great challenge to provide such a system for Bangla than most other languages.…”

Section: Introduction (mentioning)

confidence: 99%

Convolutional neural network-based ensemble methods to recognize Bangla handwritten character

Shibly, Tisha, Tani et al.

2021

PeerJ Computer Science


In this era of advancements in deep learning, an autonomous system that recognizes handwritten characters and texts can eventually be integrated with software to provide a better user experience. Like other languages, Bangla handwritten text extraction has various applications such as post-office automation, signboard recognition, and many more. A large-scale and efficient isolated Bangla handwritten character classifier can be the first building block of such a system. This study aims to classify handwritten Bangla characters. The proposed methods are divided into three phases. In the first phase, seven convolutional neural network (CNN) based architectures are created. After that, the best performing CNN model is identified and used as a feature extractor. Classifiers are then obtained by using shallow machine learning algorithms. In the last phase, five ensemble methods are used to achieve better performance in the classification task. To systematically assess the outcomes of this study, a comparative analysis of the performances has also been carried out. Among all the methods, the stacked generalization ensemble method achieved the best performance. It obtained accuracy, precision, and recall of 98.68%, 98.69%, and 98.68%, respectively, on the Ekush dataset. Moreover, the use of CNN architectures and ensemble methods in large-scale Bangla handwritten character recognition is further justified by consistent results on the BanglaLekha-Isolated dataset. Such efficient systems can move handwritten recognition to the next level so that handwriting processing can easily be automated.
