2011 7th Iranian Conference on Machine Vision and Image Processing 2011
DOI: 10.1109/iranianmvip.2011.6121550

A Lexicon Reduction Method Based on Clustering Word Images in Offline Farsi Handwritten Word Recognition Systems

Abstract: In this paper, a novel approach to lexicon reduction for Farsi words is proposed. For this purpose, we extract the upper and lower profiles, the vertical projection profile, and the black/white transitions from word images. The similarity between words in the database is measured using dynamic time warping (DTW). The Isoclus algorithm is used to cluster the handwritten word images of the training dataset. The initial cluster centers are determined by an agglomerative hierarchical clustering algorithm. Experimental results on the IRANSHAHR dataset show a promising …
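To make the feature-and-matching step in the abstract concrete, here is a minimal Python sketch of two of the column-wise features (upper profile and vertical projection) and a textbook DTW distance between two such sequences. It is an illustration under assumptions, not the paper's implementation: the function names, the binary-image representation (1 = ink, 0 = background), and the absolute-difference local cost are all hypothetical choices, since the abstract does not specify feature normalization or the DTW cost function.

```python
from typing import List, Sequence

# Assumed representation: a binary image as rows of 0/1, where 1 marks ink.
Image = List[List[int]]


def upper_profile(img: Image) -> List[int]:
    """Per column, the row index of the first ink pixel from the top.

    Columns without ink get the image height (an assumed convention).
    """
    height = len(img)
    width = len(img[0])
    return [
        next((r for r in range(height) if img[r][c]), height)
        for c in range(width)
    ]


def vertical_projection(img: Image) -> List[int]:
    """Per column, the number of ink pixels."""
    return [sum(row[c] for row in img) for c in range(len(img[0]))]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping with |x - y| as the local cost (assumed)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier element of b
                cost[i][j - 1],      # b[j-1] also matches an earlier element of a
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]


# Two tiny hypothetical word images of different widths.
img_a: Image = [[0, 1, 0],
                [1, 0, 0]]
img_b: Image = [[0, 1, 1, 0],
                [1, 0, 0, 0]]
distance = dtw_distance(upper_profile(img_a), upper_profile(img_b))
```

Because DTW tolerates sequences of different lengths, word images of different widths can be compared column by column without resampling. A pairwise distance matrix built this way is the kind of input a clustering stage, such as the one the abstract describes, could consume.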

Cited by 9 publications (5 citation statements)
References 11 publications
“…The ENIT/IFN dataset contains 32,492 word-images of Tunisian village and town names and includes five subsets, namely, a, b, c, d, and e. In order to write the vocabulary, more than 1000 writers were employed and this vocabulary entails 946 unique village and city names [58,59]. The known Iranshahr dataset includes nearly 17,000 images of handwritten names of 503 cities of Iran [60][61][62].…”
Section: Results
Citation type: mentioning
confidence: 99%
“…To implement the proposed method, we used the images of 200 out of 502 city names in the ‘Iranshahr’ dataset [11]. Among the city names having more than 30 samples, 200 cities were randomly selected.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…There are about 17,000 images in the database, which means that more than 30 samples are ready for each word class. The database has also been used in [30]. There are also a total of 425 sub-word classes.…”
Section: Database
Citation type: mentioning
confidence: 99%