Language Identification in Degraded and Distorted Document Images

Lu, Shijian; Tan, Chew Lim; Huang, Wei‐Hua

doi:10.1007/11669487_21

Cited by 9 publications

(10 citation statements)

References 9 publications

“…In Ref. [17], we proposed to first remove impulse noise by a size filtering process where the filtering threshold depends heavily on the image resolution. Here, we instead suppress impulse noise by using a center-weighted median filter [19].…”

Section: Document Image Preprocessing (mentioning)

confidence: 99%

“…In this paper, we adopt the word shape coding scheme reported in our earlier work [17,18] and use it for the document vector construction. Two word shape features are utilized including the character extremum points and the number of horizontal word cuts illustrated in Fig.…”

Section: Word Shape Coding (mentioning)

confidence: 99%

“…Its number of horizontal word cuts 11 may be ambiguously interpreted as converted from extremum points over character descenders. Besides, instead of searching for the vector element one by one as done in [17,18], the element within a document vector is arranged in a descending order in term of the word frequency so that a newly converted word shape code may locate the matched vector element as soon as possible.…”

Section: Document Vector Construction (mentioning)

confidence: 99%

“…[17,18], we report a language filtering identification technique by using a word shape coding scheme, which converts each document image into an electronic document vector that captures the contents of image documents efficiently. In this paper, we adopt that word shape coding scheme and use it for document image retrieval.…”

Section: Introduction (mentioning)

confidence: 99%

Retrieval of machine-printed Latin documents through Word Shape Coding

Tan

2008

Pattern Recognition

Self Cite

“…[9][16] [1]. in such environment the large volume of data and variety of scripts makes such manual identification unworkable [9] [16]. In such cases the ability to automatically determine the script, and further, the language of a document, would reduce the time and cost of document handling.…”

Section: Introduction (mentioning)

confidence: 99%