2015
DOI: 10.1162/coli_a_00216

Spelling Error Patterns in Brazilian Portuguese

Abstract: Fifty years after Damerau set up his statistics for the distribution of errors in typed texts, his findings are still used in a range of different languages. Because these statistics were derived from texts in English, the question of whether they actually apply to other languages has been raised. We address this issue through the analysis of a set of typed texts in Brazilian Portuguese, deriving statistics tailored to this language. Results show that diacritical marks play a major role, as indicated by the fr…

Cited by 8 publications
(7 citation statements)
References 9 publications
“…Some writing systems (such as Arabic, Vietnamese, or Slovak) use different character variants that change the meaning of the word. The authors in [10] confirmed that the omission of diacritics is a common type of spelling error in Brazilian Portuguese. Texts in Modern Standard Arabic are typically written without diacritical markings [11].…”
Section: Spelling Errors
Citation type: mentioning
confidence: 69%
“…However, these rules are exclusively applicable to English, so they could not be generalized to other languages. In recent years, studies have also been conducted on error patterns for other languages, namely Portuguese (Gimenes, Roman & Carvalho, 2015), Hungarian (Siklósi, Novák & Prószéky, 2016), Japanese (Baba & Suzuki, 2012), Danish (Paggio, 2000), and Punjabi (Lehal & Bhagat, 2007).…”
Section: Error Typologies
Citation type: mentioning
confidence: 99%
“…Building misspelling detector and serve it before corrector is essential, however detection compared with correction is much less discussed, and the importance of applying detection in real production associated with the most widely adopted noisy-channel approach is even yet disclosed. Some works discuss misspelling detection, correction and pattern analysis from language perspective, such as Arabic [3], Portuguese [10], Persian [26], however the methods are either around noisy-channel paradigm or only language model or mainly the pattern analysis. We also see related discussions for domain-specific applications such as for clinical text [16][25] [26], and indeed search [17][25] [20] as well, however they either focus on patterns or improving a specific area under the noisy-channel architecture such as efficient dictionary or matching technique.…”
Section: Related Work
Citation type: mentioning
confidence: 99%