In this paper, we present a new multilingual Optical Character Recognition (OCR) system for scanned documents. For Latin scripts, current open-source systems such as Tesseract provide very high accuracy. However, accuracy on multilingual documents that include Asian characters is usually lower than on Latin-only documents. For example, when a document mixes English, Chinese, and/or Korean characters, OCR accuracy is lower than for English-only text because the character and text properties of Chinese and Korean differ considerably from those of Latin scripts. To tackle these problems, we propose a new framework using three neural blocks (a segmenter, a switcher, and multiple recognizers) and reinforcement learning of the segmenter: the segmenter partitions a given word image into multiple character images, the switcher assigns a recognizer to each sub-image, and the recognizers recognize their assigned sub-images. Training the recognizers and the switcher can be treated as a traditional image classification task, so we train them with supervised learning. However, supervised learning of the segmenter has two critical drawbacks: its objective function is sub-optimal, and its training requires a large annotation effort. Thus, by adopting the REINFORCE algorithm, we train the segmenter to optimize the overall performance, i.e., to minimize the edit distance of the final recognition results. Experimental results show that the proposed method significantly improves performance on multilingual scripts and large-character-set languages without using character boundary labels. INDEX TERMS Deep learning, document analysis, optical character recognition
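The REINFORCE training described above can be illustrated with a toy sketch. Everything here is a hypothetical simplification, not the paper's architecture: the "word image" is two glyph tokens, the recognizer is a lookup table that misreads merged glyphs, and the segmenter is a one-parameter policy over a single cut point, trained with the reward being the negative edit distance of the final recognition result.

```python
import math
import random

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
    return d[m][n]

def recognize(segments):
    """Stand-in recognizer: reads single glyphs correctly, misreads merged ones."""
    table = {("a",): "a", ("b",): "b"}
    return "".join(table.get(tuple(seg), "?") for seg in segments)

def train_segmenter(steps=500, lr=0.5, seed=0):
    """REINFORCE on a one-parameter cut policy: sigmoid(theta) is the
    probability of cutting the toy word image ["a", "b"] between its glyphs.
    Reward = -edit_distance(recognition, ground truth)."""
    rng = random.Random(seed)
    theta = 0.0
    truth = "ab"
    for _ in range(steps):
        p_cut = 1.0 / (1.0 + math.exp(-theta))   # sigmoid policy
        cut = rng.random() < p_cut               # sample a segmentation action
        segments = [["a"], ["b"]] if cut else [["a", "b"]]
        reward = -edit_distance(recognize(segments), truth)
        # REINFORCE update: reward times grad of log-prob of the sampled action
        grad_logp = (1.0 - p_cut) if cut else -p_cut
        theta += lr * reward * grad_logp
    return 1.0 / (1.0 + math.exp(-theta))
```

Because the correct segmentation yields zero edit distance (reward 0) while the merged reading is penalized, the cut probability is driven toward 1 without any character boundary label — the same signal the paper uses, scaled up to many cut points and neural recognizers.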
The 2015 Varsity Medical Ethics debate convened on the motion: “This house believes nootropic drugs should be available under prescription”. This annual debate between students from the Universities of Oxford and Cambridge, now in its seventh year, provided the starting point for arguments on the subject. The present article brings together and extends many of the arguments put forward during the debate. We explore the current usage of nootropic drugs, their safety, and whether it would be beneficial to individuals and society as a whole for them to be available under prescription. The Varsity Medical Debate was first held in 2008 with the aim of allowing students to engage in discussion about ethics and policy within healthcare. The event is held annually, in the hope of allowing future leaders to voice a perspective on the arguments behind topics that will feature heavily in future healthcare and science policy. This year the debate was hosted by the Oxford University Medical Society at the Oxford Union.
In hash-based image retrieval systems, a transformed version of an input usually generates a different code from the original, deteriorating retrieval accuracy. To mitigate this issue, data augmentation can be applied during training. However, even if augmented samples of the same content are close in real space, quantization can scatter them far apart in Hamming space. This results in representation discrepancies that can impede training and degrade performance. In this work, we propose a novel self-distilled hashing scheme that minimizes the discrepancy while exploiting the potential of augmented data. By transferring the hash knowledge of weakly transformed samples to strongly transformed ones, we make the hash codes insensitive to various transformations. We also introduce hash proxy-based similarity learning and a binary cross entropy-based quantization loss to produce fine-quality hash codes. Ultimately, we construct a deep hashing framework that generates discriminative hash codes. Extensive experiments on benchmarks verify that our self-distillation improves existing deep hashing approaches, and our framework achieves state-of-the-art retrieval results. The code will be released soon.
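The weak-to-strong hash knowledge transfer can be sketched as a teacher-student consistency loss. This is a minimal NumPy illustration under stated assumptions: the tanh relaxation, the cosine form of the loss, and all function names are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def hash_codes(features, W):
    """Continuous hash outputs (tanh relaxation) before sign() binarization."""
    return np.tanh(features @ W)

def self_distill_loss(weak_codes, strong_codes):
    """Pull the strong view's codes toward the weak view's codes via cosine
    distance. In a real framework the weak (teacher) codes would be detached
    from the gradient so only the strong branch is updated."""
    t, s = weak_codes, strong_codes
    cos = np.sum(t * s, axis=1) / (
        np.linalg.norm(t, axis=1) * np.linalg.norm(s, axis=1) + 1e-8
    )
    return float(np.mean(1.0 - cos))
```

When the two views produce identical codes the loss is 0; when quantization pushes them to opposite corners of Hamming space (sign-flipped codes) the loss reaches its maximum of 2, which is exactly the discrepancy the scheme penalizes.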
Deep hashing aims to produce discriminative binary hash codes for fast image retrieval through a deep baseline network and an additional trainable hash function. In a supervised deep hashing network, the baseline network is generally initialized with classification-based pretrained models, and the overall hashing network is trained in a supervised fashion. However, since classification and retrieval are two different tasks, it is necessary to reconsider the initial model for the baseline network. In this paper, we propose, for the first time, to use a self-supervised pretrained model as the baseline. We investigate the impact of pretrained model types by comparing deep hashing networks whose baseline is initialized with 1) random weights, 2) conventional supervised pretrained weights, and 3) the proposed self-supervised pretrained weights. As a result, we confirm that deep hashing performance differs depending on the initial baseline setting, and the proposed self-supervised baseline model shows performance comparable to or better than the supervised one. Our code is released at https://github.com/HaeyoonYang/SSPH.
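The three baseline initializations being compared can be mocked up as a small factory function. This is a hypothetical sketch: in the actual experiments the choice swaps entire pretrained backbones, whereas here a weight matrix and a `checkpoints` dict stand in for them.

```python
import numpy as np

def init_baseline(mode, shape=(8, 8), checkpoints=None, seed=0):
    """Return initial weights for the hashing baseline network.

    mode: 'random' | 'supervised' | 'self-supervised'.
    checkpoints: hypothetical dict mapping mode -> pretrained weight array.
    """
    if mode == "random":
        # small random init, as in condition 1) of the comparison
        return np.random.default_rng(seed).standard_normal(shape) * 0.01
    if checkpoints is not None and mode in checkpoints:
        # conditions 2) and 3): load the corresponding pretrained weights
        return np.asarray(checkpoints[mode], dtype=float)
    raise ValueError(f"no pretrained weights available for mode {mode!r}")
```

Keeping the rest of the hashing network fixed while only this initialization varies is what isolates the effect of the pretrained model type in the comparison above.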