Congjun Long scite author profile

Congjun Long

3 Publications

17 Citation Statements Received

40 Citation Statements Given


Affiliations

Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences; Minzu University of China

Publications


Segmentation and Recognition for Historical Tibetan Document Images

Long, Duan, et al. 2020. IEEE Access.


As a shining pearl in traditional Tibetan culture, historical Tibetan documents have received extensive attention from historians, linguists and Buddhist scholars. These documents are converted into digital form using Tibetan document segmentation and recognition methods. Digitizing them is of great significance for the research, protection and inheritance of Tibetan history. This paper proposes an overall segmentation and recognition framework for historical Tibetan document images. First, the historical Tibetan document image is preprocessed to correct uneven illumination, tilt and noise, and is then binarized. Second, we propose a layout segmentation method based on block projection that segments Tibetan document images into texts, lines and frames. Third, to handle touching strokes between text lines and curvilinear text lines, we present a text-line segmentation method based on a graph model. Lastly, we present a method for segmenting touching Tibetan character strings, and then recognize the Tibetan characters. Experimental results show that the proposed methods for layout segmentation, text-line segmentation and touching character string segmentation achieve satisfactory performance. The proposed methods can also be applied to other fonts in the Tibetan font family.
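The abstract does not spell out how block projection works, so the following is only a minimal sketch of the general idea behind projection-based segmentation: count ink pixels per row of a binarized image and treat runs of non-empty rows as text bands. It is a plain horizontal projection profile, not the paper's block projection method. The function names, the `min_ink` threshold and the toy `page` are assumptions made for illustration.

```python
from typing import List, Tuple


def row_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in every row of a binarized image."""
    return [sum(row) for row in binary]


def find_bands(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, where ink >= min_ink.

    min_ink is a hypothetical threshold, not a value taken from the paper.
    """
    bands: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a band of ink begins here
        elif ink < min_ink and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(profile)))
    return bands


# Toy 7x6 binarized "page": 1 = ink, 0 = background (made-up data).
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

profile = row_projection(page)
bands = find_bands(profile)
print(profile)  # [0, 3, 5, 0, 0, 4, 0]
print(bands)    # [(1, 3), (5, 6)]
```

A profile like this only separates lines that have an empty gap between them. Touching strokes and curved text lines leave no such gap, which is the case the abstract says its graph-model method is meant to address.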


Tibetan Word Segmentation Based on Word-Position Tagging

Kang, Jiang, Long. 2013.


The main advantage of word-position-based Tibetan word segmentation is that it reduces segmentation errors on unknown words. In this article, the authors extend the usual 4-tag set to a 6-tag set to fit the features of Tibetan characters, use a CRF as the tagging model to train and test on corpus data, and then build post-processing modules to revise the results. The experimental results show that this method performs well and deserves further study, including expanding the corpus and optimizing the tag set and feature templates.
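Word-position tagging recasts segmentation as labeling each unit with its position inside a word. As a minimal sketch, the code below uses the common 4-tag scheme (B = begin, M = middle, E = end, S = single-unit word) and shows how segmented words map to tags and back. The abstract does not list the members of its 6-tag set, so that set is not reproduced here. The function names are hypothetical, and the units are placeholder strings rather than real Tibetan syllables. In the described approach a CRF would predict the tags; here they are derived from a known segmentation.

```python
from typing import List


def words_to_tags(words: List[List[str]]) -> List[str]:
    """Turn a segmented sentence (list of words, each a list of units) into B/M/E/S tags."""
    tags: List[str] = []
    for units in words:
        if not units:
            continue                       # ignore empty words
        if len(units) == 1:
            tags.append("S")               # single-unit word
        else:
            tags.extend(["B"] + ["M"] * (len(units) - 2) + ["E"])
    return tags


def tags_to_words(units: List[str], tags: List[str]) -> List[List[str]]:
    """Rebuild words from a flat unit sequence and its tags."""
    words: List[List[str]] = []
    current: List[str] = []
    for unit, tag in zip(units, tags):
        current.append(unit)
        if tag in ("E", "S"):              # a word closes on E or S
            words.append(current)
            current = []
    if current:                            # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


# Placeholder units standing in for syllables (made-up data).
segmented = [["a"], ["b", "c"], ["d", "e", "f"]]
flat_units = [u for word in segmented for u in word]

tags = words_to_tags(segmented)
print(tags)                               # ['S', 'B', 'E', 'B', 'M', 'E']
print(tags_to_words(flat_units, tags))    # [['a'], ['b', 'c'], ['d', 'e', 'f']]
```

Because every unit gets a tag regardless of whether its word is in a dictionary, a tagger trained on such labels can still propose boundaries for unseen words, which is the advantage the abstract highlights.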


Tibetan Word Segmentation as Sub-syllable Tagging with Syllable’s Part-of-Speech Property

Liu, Long, Nuo, et al. 2015.

