2016
DOI: 10.1515/pralin-2016-0004

A Comparison of Four Character-Level String-to-String Translation Models for (OCR) Spelling Error Correction

Abstract: We consider the isolated spelling error correction problem as a specific subproblem of the more general string-to-string translation problem. In this context, we investigate four general string-to-string transformation models that have been suggested in recent years and apply them within the spelling error correction paradigm. In particular, we investigate how a simple ‘k-best decoding plus dictionary lookup’ strategy performs in this context and find that such an approach can significantly outdo baselines suc…
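
The abstract names a 'k-best decoding plus dictionary lookup' strategy but, as truncated here, does not spell out its details. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the translation model returns its candidates ordered best-first, and that the output falls back to the top candidate when no candidate is in the dictionary. The function name `correct_with_dictionary` and both of these choices are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def correct_with_dictionary(
    kbest: Sequence[str], dictionary: Iterable[str]
) -> Optional[str]:
    """Pick a correction from a k-best list using a dictionary filter.

    Assumption (not from the source): `kbest` is ordered best-first, and
    the top-ranked candidate is returned if no candidate is a known word.
    """
    known_words = set(dictionary)
    for candidate in kbest:
        # Return the highest-ranked candidate that is a dictionary word.
        if candidate in known_words:
            return candidate
    # Fallback: no candidate is in the dictionary (or the list is empty).
    return kbest[0] if kbest else None


# Hypothetical k-best output for a misspelled input form.
candidates = ["teh", "the", "ten"]
print(correct_with_dictionary(candidates, {"the", "ten"}))
```

The dictionary acts purely as a filter over the model's ranking, so this sketch leaves the order of the candidates untouched.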

Cited by 17 publications (32 citation statements)
References 26 publications (32 reference statements)
“…The average performance of the perceptron tagger in this experiment is superior to the performance of the AliSeTra system as reported by Eger et al (2016). The difference in performance is, however, not statistically significant.…”
Section: Results (contrasting)
confidence: 48%
“…UC refers to the unstructured classifier presented in Section 3.1, PT to the perceptron tagger presented in Section 3.2 and AliSeTra to the system presented by Eger et al (2016).…”
Section: Results (mentioning)
confidence: 99%
“…Note that when all input forms are incorrect (as in the case of the Twitter data), CR corresponds exactly to the evaluation metric word accuracy (WACC) used by Eger et al (2016) because the count $fp$ is 0. $\mathrm{WACC} = \frac{tp}{tp + fn}$ Tables 2 and 3 show the results of the experiments on the Finnish OCR data and Twitter data.…”
Section: Methods (mentioning)
confidence: 99%
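
The word accuracy metric quoted above, WACC = tp / (tp + fn), can be sketched as follows. This is an illustration under stated assumptions rather than the cited authors' evaluation code: it assumes every input form is incorrect, so that each item counts either as a true positive (the system output matches the gold form) or as a false negative (it does not). The function name `word_accuracy` is hypothetical.

```python
from typing import Sequence


def word_accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Compute WACC = tp / (tp + fn) over aligned predictions and gold forms.

    Assumption: all input forms are erroneous, so fp = 0 and each item is
    either a true positive (corrected properly) or a false negative.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    tp = sum(1 for p, g in zip(predictions, gold) if p == g)
    fn = len(gold) - tp
    return tp / (tp + fn)


# Hypothetical system outputs and gold corrections.
system_output = ["the", "cat", "sat"]
gold_forms = ["the", "cat", "mat"]
print(word_accuracy(system_output, gold_forms))
```

Under this assumption tp + fn equals the number of test items, so the metric reduces to the fraction of input forms that were corrected to the gold form.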