Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2014
DOI: 10.3115/v1/p14-2027

Generalized Character-Level Spelling Error Correction

Abstract: We present a generalized discriminative model for spelling error correction which targets character-level transformations. While operating at the character level, the model makes use of word-level and contextual information. In contrast to previous work, the proposed approach learns to correct a variety of error types without guidance of manually selected constraints or language-specific features. We apply the model to correct errors in Egyptian Arabic dialect text, achieving 65% reduction in word error rate ove…

Cited by 21 publications (7 citation statements)
References 17 publications (10 reference statements)
“…More recent approaches to the spelling error correction problem include Okazaki et al (2008), who suggest a discriminative model for candidate generation in spelling correction and, more generally, string transformation, and Wang et al (2014), who propose an efficient log-linear model for correcting spelling errors, which, similar to the Brill and Moore (2000) model, is based on complex substring-to-substring substitutions. Farra et al (2014) suggest a context-sensitive character-level spelling error correction model. Gubanov et al (2014) improve the Cucerzan and Brill (2004) model by iterating the application of the basic noisy channel model for spelling correction in a stochastic manner.…”
Section: Introduction (mentioning)
confidence: 99%
“…Rule Based Correction. Rule based approaches compute the edit cost between two text snippets based on weighted finite state machines (WFSM) (Brill and Moore, 2000; Dreyer et al, 2008; Wang et al, 2014; Silfverberg et al, 2016; Farra et al, 2014). WFSM require predefined rules (insertion, deletion, etc., of characters) and a lexicon, which is used to assess the transformations.…”
Section: Related Work (mentioning)
confidence: 99%
“…Early works have used edit distance to find morphologically similar corrections (Ristad and Yianilos, 1998), noisy channel model for misspellings (Jurafsky and Martin, 2014), and iterative search to improve corrections of distant spelling errors (Gubanov et al, 2014). Word contexts have been shown to improve the robustness of spell checkers, with n-gram language models as one approach to incorporate contextual information (Hassan and Menezes, 2013; Farra et al, 2014). Other ways of incorporating contextual information include n-gram statistics capturing the cohesiveness of a candidate word with the given context (Wint et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%