2007
DOI: 10.1109/icdar.2007.4377099
Context-Sensitive Error Correction: Using Topic Models to Improve OCR

Cited by 22 publications (14 citation statements) | References 7 publications
“…We believe that further improvements can be achieved by using the clean lists in conjunction with more sophisticated models, such as document-specific language models, as suggested by [19]. In addition, we believe that the clean lists can also be used to re-segment and fix the large percentage of initial errors that result from incorrect character segmentation.…”
Section: Results
confidence: 99%
“…The People-LDA model [23] combined a hyper-feature-based face identifier with an LDA model to center topics around people. Wick et al [24] used topic models to automatically detect and represent an article's semantic context for OCR improvement.…”
Section: Related Work
confidence: 99%
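The idea attributed to Wick et al [24] — using a document's topic distribution to pick among OCR word candidates — can be illustrated with a minimal sketch. The topic word distributions, topic mixture, and candidate list below are entirely hypothetical toy data, not the paper's actual model:

```python
# Hypothetical sketch of topic-model-based OCR candidate re-ranking,
# in the spirit of Wick et al [24]: among the OCR engine's candidate
# words, choose the one most probable under the document's inferred
# topic mixture. All distributions here are illustrative toy values.

def topic_score(word, topic_mix, topics):
    """P(word | doc) = sum over topics k of P(k | doc) * P(word | k)."""
    # A small floor probability stands in for proper smoothing.
    return sum(weight * topics[k].get(word, 1e-6)
               for k, weight in topic_mix.items())

def correct(candidates, topic_mix, topics):
    """Return the OCR candidate with the highest topic-model probability."""
    return max(candidates, key=lambda w: topic_score(w, topic_mix, topics))

# Toy topics: a "finance" topic and a "farming" topic.
topics = {
    "finance": {"bank": 0.05, "loan": 0.04, "rate": 0.03},
    "farming": {"barn": 0.05, "crop": 0.04, "soil": 0.03},
}

# A document whose inferred topic mixture is mostly finance.
topic_mix = {"finance": 0.9, "farming": 0.1}

# The OCR engine is unsure whether a smudged word is "bank" or "barn";
# the document's financial context resolves the ambiguity.
print(correct(["bank", "barn"], topic_mix, topics))  # → bank
```

In a real system the topic mixture would come from LDA inference over the OCR output itself, so the context estimate and the corrections can reinforce each other.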
“…In the past, most studies in error detection [2], [3] have focused on English or a few Latin-script languages such as German. In 1992, Kukich [1] performed experimental analysis with merely a few thousand words, while the methods discussed in 2011 by Smith [4] use a corpus as large as 100 billion words.…”
Section: Introduction
confidence: 99%