2006
DOI: 10.1007/11669487_1

Retrieval from Document Image Collections

Abstract: This paper presents a system for retrieval of relevant documents from large document image collections. We achieve effective search and retrieval from a large collection of printed document images by matching image features at word-level. For representations of the words, profile-based and shape-based features are employed. A novel DTW-based partial matching scheme is employed to take care of morphologically variant words. This is useful for grouping together similar words during the indexing process. The syste…
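As a rough illustration of the kind of matching the abstract refers to, the sketch below computes a plain Dynamic Time Warping (DTW) distance between two one-dimensional feature sequences, such as a per-column profile of a word image. It is not the paper's partial matching scheme, which is not described in this excerpt; the function name `dtw_distance`, the absolute-difference local cost, and the scalar features are all assumptions made for the example.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance between two 1-D feature sequences.

    Hypothetical helper: the scalar features and the absolute-difference
    local cost are assumptions, not details taken from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s successor path
                cost[i][j - 1],      # b[j-1] aligned with an already-used a[i-1]
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


# Sequences that differ only by a repeated value align at zero cost.
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
```

Because DTW allows one element to align with several, sequences of different lengths can still match closely, which is the property that makes it attractive for comparing word images of varying width.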

Citations: cited by 43 publications (31 citation statements)
References: 4 publications
“…Three coarse features such as word profiles, structural features and transform domain represented was addressed to indicate the features of a word image [13] [3]. Feature values are normalized such that the word representations become insensitive to variations in size, fonts and degradations.…”
Section: Related Work (mentioning)
confidence: 99%
“…Besides the works of William Clocksin [1], no previous work has been published on Syriac handwriting recognition. Different approaches exist for handwriting recognition in historical manuscripts.…”
Section: Related Work (mentioning)
confidence: 99%
“…To overcome the morphological differences between the words, the matching is performed using a Dynamic Time Warping (DTW) algorithm. DTW is also used in [1] to match whole words. Another segmentation-free approach which uses HMMs and statistical language models for handwritten text recognition is described in [13].…”
Section: Related Work (mentioning)
confidence: 99%
“…However, we annotate each of the clusters, instead of directly using them to build the index. An attempt at manual annotation of word image clusters was reported in [23], which is generally un-affordable. The motivation to use an annotation based approach comes from recent interest in automatic annotation [16,17,15].…”
Section: Related Work (mentioning)
confidence: 99%