Recent progress has been made on developing a unified framework for joint text detection and recognition in natural images, but existing joint models are mostly built on a two-stage framework involving ROI pooling, which can degrade performance on the recognition task. In this work, we propose convolutional character networks, referred to as CharNet, a one-stage model that processes the two tasks simultaneously in one pass. CharNet directly outputs bounding boxes of words and characters, with corresponding character labels. We use the character as the basic element, which allows us to overcome the main difficulty of existing approaches that attempt to optimize text detection jointly with an RNN-based recognition branch. In addition, we develop an iterative character detection approach that transfers the character detection ability learned from synthetic data to real-world images. These technical improvements result in a simple, compact, yet powerful one-stage model that works reliably on multi-orientation and curved text. We evaluate CharNet on three standard benchmarks, where it consistently outperforms the state-of-the-art approaches [25,24] by a large margin, e.g., with improvements of 65.33%→71.08% (with generic lexicon) on ICDAR 2015, and 54.0%→69.23% on Total-Text, on end-to-end text recognition. Code is available at: https://github.com/MalongTech/research-charnet.
Text-independent writer identification is challenging due to the huge variation in written content and the ambiguous writing styles of different writers. This paper proposes DeepWriter, a deep multi-stream CNN that learns a powerful deep representation for recognizing writers. DeepWriter takes local handwritten patches as input and is trained with a softmax classification loss. The main contributions are: 1) we design and optimize a multi-stream structure for the writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text images of different lengths. In addition, we find that different languages such as English and Chinese may share common features for writer identification, and that joint training can yield better performance. Experimental results on the IAM and HWDB datasets show that our models achieve high identification accuracy: 99.01% on 301 writers and 97.03% on 657 writers with one English sentence as input, and 93.85% on 300 writers with one Chinese character as input, which outperforms previous methods by a large margin. Moreover, our models obtain an accuracy of 98.01% on 301 writers with only 4 English letters as input.
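The patch scanning idea can be illustrated with a minimal sketch. The abstract does not spell out the details, so everything here is an assumption for illustration: the function names `scan_patches` and `identify_writer`, the patch width and stride values, the handling of the right edge, and the choice to average per-patch softmax outputs. The patch classifier is passed in as a hypothetical callable rather than a real network.

```python
def scan_patches(image, patch_width, stride):
    """Yield fixed-width patches cut left to right from a 2-D image.

    `image` is a list of rows (lists of pixel values). An image narrower
    than `patch_width` yields a single, narrower patch; a real system
    would probably pad or resize instead (assumption).
    """
    width = len(image[0])
    last_start = max(width - patch_width, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra patch so the right edge is covered
    for s in starts:
        yield [row[s:s + patch_width] for row in image]


def identify_writer(image, classify_patch, patch_width=4, stride=2):
    """Average per-patch class probabilities and return (writer_index, mean_probs).

    `classify_patch` is a hypothetical callable that maps one patch to a
    list of softmax probabilities, one per writer. Averaging is an assumed
    aggregation rule, not one stated in the abstract.
    """
    totals = None
    count = 0
    for patch in scan_patches(image, patch_width, stride):
        probs = list(classify_patch(patch))
        totals = probs if totals is None else [t + p for t, p in zip(totals, probs)]
        count += 1
    mean = [t / count for t in totals]
    best = max(range(len(mean)), key=mean.__getitem__)
    return best, mean
```

Because the number of patches grows with the image width while the averaged output keeps a fixed size, the same classifier can be applied to text images of any length.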
In this paper, we present a new and efficient clustering approach for scene analysis in sports video. The method is generic and does not require any prior domain knowledge. It is unsupervised and relies on analyzing the scene likeness of the shots in the video. In each iteration, the two most similar shots are merged into the same scene, and this procedure is repeated until the merging stop criterion is satisfied. The stop criterion is based on a J value defined according to the Fisher Discriminant Function. We call this method J-based Scene Clustering. With this method, shots, the low-level video content representation, can be clustered into scenes, the mid-level video content representation, which are useful for high-level sports video content analysis such as play-break parsing, story unit detection, and highlight extraction and summarization. Experimental results obtained from various types of broadcast sports videos demonstrate the efficacy of the proposed approach. We also present a simple application of our scene clustering method to story unit detection in periodic sports videos such as archery and diving. The experimental results are encouraging.
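The merge-until-stop procedure described above can be sketched as follows. The abstract does not give the formula for J, so the criterion below is a placeholder assumption: the ratio of within-scene scatter to total scatter (a Fisher-style quantity) plus the fraction of scenes remaining, with merging stopped as soon as a merge would no longer decrease J. The shot feature vectors, the centroid-distance similarity, and the names `cluster_shots` and `j_value` are also illustrative assumptions.

```python
from itertools import combinations


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def scatter(vectors):
    """Sum of squared Euclidean distances from each vector to the centroid."""
    c = centroid(vectors)
    return sum(sum((x - m) ** 2 for x, m in zip(v, c)) for v in vectors)


def j_value(clusters, features, total_scatter):
    """ASSUMED criterion (not the paper's formula):
    within-scene scatter / total scatter + number of scenes / number of shots."""
    within = sum(scatter([features[i] for i in c]) for c in clusters)
    ratio = within / total_scatter if total_scatter > 0 else 0.0
    return ratio + len(clusters) / len(features)


def cluster_shots(features):
    """Greedily merge the two most similar scenes until J stops decreasing.

    `features` holds one feature vector per shot; the result is a list of
    scenes, each a list of shot indices.
    """
    if not features:
        return []
    clusters = [[i] for i in range(len(features))]  # every shot starts as its own scene
    total = scatter(features)
    current_j = j_value(clusters, features, total)
    while len(clusters) > 1:
        def gap(pair):
            # Squared distance between scene centroids (assumed similarity measure).
            ca = centroid([features[i] for i in clusters[pair[0]]])
            cb = centroid([features[i] for i in clusters[pair[1]]])
            return sum((x - y) ** 2 for x, y in zip(ca, cb))

        a, b = min(combinations(range(len(clusters)), 2), key=gap)
        merged = [c for k, c in enumerate(clusters) if k not in (a, b)]
        merged.append(clusters[a] + clusters[b])
        new_j = j_value(merged, features, total)
        if new_j >= current_j:  # stop criterion: this merge would not lower J
            break
        clusters, current_j = merged, new_j
    return clusters


# Six toy shots forming two visually distinct groups.
shots = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
scenes = cluster_shots(shots)
```

On this toy input the loop stops with two scenes, one holding shots 0 to 2 and the other shots 3 to 5, because merging them into a single scene would raise the assumed J value sharply.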