Abstract: Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content categorization using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant shape feature that is generic enough to be detected repeatably and segmentation free. We learn a concise, structurally indexed shape lexicon from training by clustering and partitioning feature types through graph cuts. We demo…
“…Portions of this paper appeared in previous conference publications [25,32]. This research was supported by the US Department of Defense under contract MDA-9040-2C-0406.…”
Section: Acknowledgments (mentioning)
confidence: 92%
“…A codebook provides a concise structural organization for associating large varieties of low-level features [25], and is efficient because it enables comparison to much fewer feature types.…”
Section: Learning the Shape Codebook (mentioning)
confidence: 99%
“…Our approach is based on the view that the intricate differences between languages can be effectively captured using low-level segmentation-free shape features and structurally indexed shape descriptors [25]. Low-level local shape features serve well suited for this purpose because they can be detected robustly in practice, without detection or segmentation of high-level entities, such as text lines or words.…”