Abstract: Reading text from photographs is a challenging problem that has received a significant amount of attention. Two key components of most systems are (i) text detection from images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, we apply methods recently developed in machine learning (specifically, large-scale algorithms for learning features automatically from unlabeled data) and show that they allow us to construct highly effective classifiers for both detection and recognition, to be used in a high-accuracy end-to-end system.
Abstract: We consider the problem of automatically collecting semantic labels during robotic mapping by extending the mapping system to include text detection and recognition modules. In particular, we describe a system by which a SLAM-generated map of an office environment can be annotated with text labels such as room numbers and the names of office occupants. These labels are acquired automatically from signs posted on walls throughout a building. Deploying such a system using current text recognition systems, however, is difficult since even state-of-the-art systems have difficulty reading text from non-document images. Despite these difficulties, we present a series of additions to the typical mapping pipeline that nevertheless allow us to create highly usable results. In fact, we show how our text detection and recognition system, combined with several other ingredients, allows us to generate an annotated map that enables our robot to recognize named locations specified by a user in 84% of cases.
We present OpenSeq2Seq, an open-source toolkit for training sequence-to-sequence models. The main goal of our toolkit is to allow researchers to most effectively explore different sequence-to-sequence architectures. The efficiency is achieved by fully supporting distributed and mixed-precision training. OpenSeq2Seq provides building blocks for training encoder-decoder models for neural machine translation and automatic speech recognition. We plan to extend it with other modalities in the future.