Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua 2016
DOI: 10.18653/v1/n16-1080

A Joint Model of Orthography and Morphological Segmentation

Abstract: We present a model of morphological segmentation that jointly learns to segment and restore orthographic changes, e.g., funniest → fun-y-est. We term this form of analysis canonical segmentation and contrast it with the traditional surface segmentation, which segments a surface form into a sequence of substrings, e.g., funniest → funn-i-est. We derive an importance sampling algorithm for approximate inference in the model and report experimental results on English, German and Indonesian.

Cited by 27 publications (53 citation statements)
References 25 publications
“…Indonesian, or Bahasa Indonesia, is the official language of Indonesia. Cotterell et al. (2016) report the best experimental results for Indonesian, followed by English and finally German. The high error rate for German might be caused by it being rich in orthographic changes.…”
Section: Languages (mentioning)
confidence: 94%
“…To enable comparison to earlier work, we use a dataset that was prepared by Cotterell et al. (2016) for canonical segmentation.…”
Section: Methods (mentioning)
confidence: 99%