As physical manuscripts are increasingly brought into the digital environment, systems commonly offer automatic mechanisms for offline Handwritten Text Recognition (HTR). However, varied scenarios and writing styles make accurate recognition challenging. To reduce this problem, optical models can be combined with language models that assist in decoding the text: dictionaries of characters and words are generated from the dataset, and linguistic constraints are imposed on the recognition process. In this context, this work proposes the use of spelling correction techniques for text post-processing, to achieve better results and to eliminate the linguistic dependence between the optical model and the decoding stage. In addition, an encoder–decoder neural network architecture and a training methodology are developed and presented for the spelling correction task. To demonstrate the effectiveness of this approach, we conducted an experiment on five text-line datasets widely known in the field of HTR, three state-of-the-art optical models for text recognition, and eight spelling correction techniques, ranging from traditional statistical methods to current neural network approaches from the field of Natural Language Processing (NLP). Finally, our proposed spelling correction model is analyzed statistically through HTR system metrics, reaching an average sentence correction 54% higher than the state-of-the-art decoding method on the tested datasets.
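To make the idea of spelling correction as a post-processing step concrete, the following is a minimal sketch of a traditional statistical corrector applied to the text produced by an optical model. It is not the encoder–decoder model proposed in the abstract. It is a simple edit-distance-1 dictionary corrector, and every name in it (`build_vocab`, `edits1`, `correct_word`, `correct_line`), the lowercase Latin alphabet, and the toy corpus are assumptions made for illustration only.

```python
from collections import Counter

# Assumption: lowercase Latin alphabet only; a real system would use the dataset's charset.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def build_vocab(corpus_text):
    """Count lowercase word frequencies in a reference corpus."""
    return Counter(corpus_text.lower().split())


def edits1(word):
    """Return all strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word, vocab):
    """Keep known words; otherwise pick the most frequent known word one edit away."""
    if word in vocab:
        return word
    candidates = [w for w in edits1(word) if w in vocab]
    if not candidates:
        return word  # nothing plausible found: leave the recognized token untouched
    # Tie-break on the word itself so the result is deterministic.
    return max(candidates, key=lambda w: (vocab[w], w))


def correct_line(line, vocab):
    """Post-process one recognized text line, word by word (output is lowercased)."""
    return " ".join(correct_word(w, vocab) for w in line.lower().split())


if __name__ == "__main__":
    vocab = build_vocab("the quick brown fox jumps over the lazy dog")
    print(correct_line("the quick brwon fox jumsp ovr the lazy dog", vocab))
```

Because the corrector only sees the recognized text, it can be swapped or retrained without touching the optical model, which is the kind of decoupling the abstract argues for. A neural corrector would replace `correct_line` while keeping the same input and output.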
Abstract: In the diagnostic phase of pathological manifestations in facades, the visual inspection stage deserves special attention because of its inherent complexity (height, size, access difficulties, and exposure conditions). In recent years, the use of deep learning techniques to detect and classify specific features in images and videos has grown steadily and, when combined with unmanned aerial vehicles (UAVs) for image capture, constitutes a tool that can assist and automate the visual inspection of facades. This article aimed to analyze digital image processing for the automatic detection of cracks in the ceramic tile cladding of buildings, in association with a UAV or drone, which could potentially bring benefits (time, cost, and safety) for diagnosis. The research results showed the technical feasibility of crack detection using digital image processing (DIP) techniques. The procedure is considered complex when there is high variation among the study images. Nevertheless, even in a limiting scenario such as the lack of public datasets for the problem, the project developed a simple and efficient methodology for the topic it set out to address.
Automatic handwriting recognition systems are of interest both for academic research and for commercial applications. Recent advances in deep learning have brought dramatic improvements on classic computer vision problems, especially Handwritten Text Recognition (HTR). However, several approaches attempt to apply deep learning to Handwritten Digit String Recognition (HDSR), where a model must cope with a small amount of training data while learning to ignore any writing symbols around the digits (noise). In this context, we present a new optical model architecture (Gated-CNN-BGRU), based on the HTR workflow and applied to HDSR. The International Conference on Frontiers of Handwriting Recognition (ICFHR) 2014 competition on HDSR was used as the baseline to evaluate the effectiveness of our proposal, and its metrics, datasets and recognition methods were adopted for fair comparison. Furthermore, we also use a private dataset (Brazilian Bank Check-Courtesy Amount Recognition), 11 different state-of-the-art approaches in HDSR, and 2 state-of-the-art optical models in HTR. Finally, the proposed optical model demonstrated robustness even with a low data volume (126 training samples, for example), surpassing the results of existing methods with an average precision of 96.50%, which corresponds to an average improvement of 3.74 percentage points over the state-of-the-art in HDSR. In addition, the result stands out on the competition's CVL HDS set, where the proposed optical model achieved a precision of 93.54%, while the best result so far had been from the Beijing group (from the competition itself), with 85.29%.