Proceedings of the Student Research Workshop at the 15th Conference Of the European Chapter of the Association for Co 2017
DOI: 10.18653/v1/e17-4002

Detecting spelling variants in non-standard texts

Abstract: Spelling variation in non-standard language, e.g. computer-mediated communication and historical texts, is usually treated as a deviation from a standard spelling, e.g. 2mr as a non-standard spelling for tomorrow. Consequently, in normalization (the standard approach of dealing with spelling variation), so-called non-standard words are mapped to their corresponding standard words. However, there is not always a corresponding standard word. This can be the case for single types (like emoticons in computermediate…

Cited by 8 publications (6 citation statements)
References 34 publications
“…Additionally, the erroneous variant pairs manually filtered out can be used as counter examples of variants, and further used to train variant classifiers (see, for instance, (Barteld, 2017)).…”
Section: Obtained Results (mentioning)
confidence: 99%
“…The closest work to ours is that described in (Barteld, 2017), as it focuses on detecting spelling variants in Middle Low German unrelated to a standard. Yet, the described method requires the training of a classifier to filter the generated pairs.…”
Section: Related Work (mentioning)
confidence: 99%
“…But given the fast-growing content on the web, the static dictionaries about Internet slang risk being outdated due to not keeping up with the current trends. It is possible to automatically detect such spelling variants (Barteld 2017), but this is an extensive research topic outside the text preprocessing scope.…”
Section: Social Media Data (mentioning)
confidence: 99%
“…three main issues. First, name mentions in dialogues are sparse (Azab et al, 2018), which makes it difficult for these models to learn a good quality representation for these names (Barteld, 2017). Second, in dialogues or narratives, names often do not refer to the same person, and yet these embeddings have a single vector representation for each word in the vocabulary.…”
Section: Introduction (mentioning)
confidence: 99%