Convolutional Attention Networks for Scene Text Recognition

Xie, Hongtao; Fang, Shancheng; Zha, Zheng-Jun; Yang, Ya-Ting Carolyn; Li, Yan; Zhang, Yongdong

doi:10.1145/3231737

Cited by 62 publications

(21 citation statements)

References 27 publications

“…However, robustness to distortion and generality to variant language are challenging for these systems. To explore the advancement in TIE techniques, [57] and as encoder in attention mechanism outperformed others [56]. Although, these techniques are showing promising results, but diversity in data sources makes the system complex [55].…”

Section: Text Recognition (mentioning)

confidence: 99%

“…CNN based OCR have also shown pretty good results but the performance of technique on unstructured big datasets is still to be investigated. The attention mechanism is a new approach in text recognition [54,56]. Initially, the results are satisfactory but there is a huge room for improvement in terms of unstructured and multidimensional big data.…”

Section: Text Recognition (mentioning)

confidence: 99%

An analytical study of information extraction from unstructured and multidimensional big data

2019

Adnan and Akbar, J Big Data (2019) 6:91

Abstract: Process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data arise new challenges for IE techniques with the rapid growth of multifaceted also called as multidimensional unstructured data. Traditional IE systems are inefficient to deal with this huge deluge of unstructured big data. The volume and variety of big data demand to improve the computational capabilities of these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representations for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very limited consolidated research work have been conducted to investigate the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work address this limitation and present a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and recommendations will help to improve the big data analytics by making it more productive.

Introduction: Information extraction (IE) process extracts useful structured information from the unstructured data in the form of entities, relations, objects, events and many other types. The extracted information from unstructured data is used to prepare data for analysis. Therefore, the efficient and accurate transformation of unstructured data in the IE process improves the data analysis. Numerous techniques have been introduced for different data types i.e. text, image, audio, and video. The advancement in technology promoted the rapid growth of data volume in recent years. The volume, variety (structured, unstructured, and semi-structured data) and velocity of big data have also changed the paradigm of computational capabilities of the systems. IBM estimated that more than 2.5 quintillion bytes of data are generated every day. Among these statistics, it was also predicted that unstructured data from diverse sources will grow up to 90% in few years. IDC estimated that unstructured data will be 95% of the global data in 2020 with estimated 65% annual growth rate [1]. The common characteristics of unstructured data are, (i) it comes in multiple formats...

“…SURF [54] and ROI based visual codebook are respectively applied for adult image detection [40]. Some other approaches [45][46][47][48][49][50][51][52][53] are worthy to be noticed.…”

Section: Related Work (mentioning)

confidence: 99%

Analyzing periodicity and saliency for adult video detection

Liu, Huang, et al. 2019

Multimed Tools Appl

Content-based adult video detection plays an important role in preventing pornography. However, existing methods usually rely on single modality and seldom focus on multi-modality semantics representation. Addressing at this problem, we put forward an approach of analyzing periodicity and saliency for adult video detection. At first, periodic patterns and salient regions are respectively analyzed in audio-frames and visual-frames. Next, the multi-modal co-occurrence semantics is described by combining audio periodicity with visual saliency. Moreover, the performance of our approach is evaluated step by step. Experimental results show that our approach obviously outperforms some state-of-the-art methods.

“…Inspired by the recent successes of convolutional neural networks (CNNs) [12]-[15] in high level computer vision tasks, deep neural networks (DNNs) emerged in addressing low level computer vision tasks as well [16]-[23]. For the task of image inpainting, Pathak et al [21] presented an auto-encoder to perform context-based image inpainting.…”

Section: Introduction (mentioning)

confidence: 99%

Adaptive Image Sampling Using Deep Learning and Its Application on X-Ray Fluorescence Image Reconstruction

Dai, Chopp, Pouyet, et al. 2020

IEEE Trans. Multimedia

This paper presents an adaptive image sampling algorithm based on Deep Learning (DL). It consists of an adaptive sampling mask generation network which is jointly trained with an image inpainting network. The sampling rate is controlled by the mask generation network, and a binarization strategy is investigated to make the sampling mask binary. In addition to the image sampling and reconstruction process, we show how it can be extended and used to speed up raster scanning such as the X-Ray fluorescence (XRF) image scanning process. Recently XRF laboratory-based systems have evolved into lightweight and portable instruments thanks to technological advancements in both X-Ray generation and detection. However, the scanning time of an XRF image is usually long due to the long exposure requirements (e.g., 100µs − 1ms per point). We propose an XRF image inpainting approach to address the long scanning times, thus speeding up the scanning process, while being able to reconstruct a high quality XRF image. The proposed adaptive image sampling algorithm is applied to the RGB image of the scanning target to generate the sampling mask. The XRF scanner is then driven according to the sampling mask to scan a subset of the total image pixels. Finally, we inpaint the scanned XRF image by fusing the RGB image to reconstruct the full scan XRF image. The experiments show that the proposed adaptive sampling algorithm is able to effectively sample the image and achieve a better reconstruction accuracy than that of existing methods.
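
The sampling step described in this abstract can be illustrated with a rough sketch. It assumes a per-pixel score map is already available and turns it into a binary mask at a chosen sampling rate, then estimates the scan time from a per-point exposure. The top-k selection used here is a stand-in, not the binarization strategy of the paper (which the abstract does not specify), and the names `binarize_top_k` and `scan_time_seconds`, the 64x64 grid, and the 20% rate are all hypothetical.

```python
import random


def binarize_top_k(scores, sampling_rate):
    """Return a binary mask that keeps the highest-scoring pixels.

    Assumption: simple top-k selection, used only to illustrate how a
    sampling rate maps to a binary mask. Not the paper's method.
    """
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in (0, 1]")
    n_pixels = sum(len(row) for row in scores)
    k = max(1, round(sampling_rate * n_pixels))
    # Rank pixel coordinates by score, highest first, and keep exactly k.
    ranked = sorted(
        ((r, c) for r, row in enumerate(scores) for c in range(len(row))),
        key=lambda rc: scores[rc[0]][rc[1]],
        reverse=True,
    )
    keep = set(ranked[:k])
    return [[(r, c) in keep for c in range(len(row))]
            for r, row in enumerate(scores)]


def scan_time_seconds(mask, exposure_s):
    """Scan time if only the pixels marked True are visited."""
    return sum(sum(row) for row in mask) * exposure_s


# Hypothetical 64x64 score map standing in for a mask network's output.
rng = random.Random(0)
scores = [[rng.random() for _ in range(64)] for _ in range(64)]

mask = binarize_top_k(scores, sampling_rate=0.2)
full_mask = [[True] * 64 for _ in range(64)]

exposure_s = 1e-3  # 1 ms per point, the upper end of the range quoted above
full_scan_s = scan_time_seconds(full_mask, exposure_s)
sub_scan_s = scan_time_seconds(mask, exposure_s)
print(f"full scan: {full_scan_s:.3f} s, subsampled scan: {sub_scan_s:.3f} s")
```

Under these assumptions the scan time scales linearly with the number of sampled pixels, which is why a lower sampling rate shortens the scan and leaves the unsampled pixels to be filled in by inpainting.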
