Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1314
Imitation Learning for Neural Morphological String Transduction

Abstract: We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert p…
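The abstract refers to a neural transition-based string transducer that rewrites an input string by emitting edit actions. As a rough illustration only (the action names, data layout, and example are assumptions, not the paper's implementation), a minimal sketch of such an action system:

```python
# Hypothetical sketch of a transition-based string transducer's action space:
# the model reads the input left to right and emits copy / delete / insert
# actions. Names and layout are illustrative, not taken from the paper.
def transduce(src, actions):
    """Execute an action sequence over src and return the output string."""
    i, out = 0, []
    for kind, ch in actions:
        if kind == "copy":           # copy the current input character
            out.append(src[i]); i += 1
        elif kind == "delete":       # consume the current input character silently
            i += 1
        elif kind == "insert":       # emit a character without consuming input
            out.append(ch)
        else:
            raise ValueError(kind)
    return "".join(out)

# Example: inflecting German "legen" to the past participle "gelegt".
actions = [("insert", "g"), ("insert", "e"),
           ("copy", ""), ("copy", ""), ("copy", ""),
           ("delete", ""), ("delete", ""), ("insert", "t")]
print(transduce("legen", actions))   # -> gelegt
```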

Cited by 29 publications (34 citation statements)
References 24 publications
“…Thus, with an appropriate training method, the neural transition-based model can be very strong in the high data setting. This is in line with the results for the SIGMORPHON 2016 dataset in Makarov and Clematide (2018a). On the other hand, we expect that gains can be made with the general soft-attention seq2seq model (or any latent-alignment model), by applying the same training method or other existing alternatives (Edunov et al, 2017).…”
Section: Task I Results and Discussion (supporting)
confidence: 86%
“…Typically, this model is trained by maximizing the likelihood of gold action sequences that come from a separate character aligner. This year, we train with an imitation learning method (Makarov and Clematide, 2018a) that enforces optimal alignment in the loss and additionally supports action-space exploration and the optimization of a task-specific objective. Our method entirely eliminates the need for a character aligner and results in substantially stronger models, at the expense of a slight increase in training time.…”
Section: Introduction (mentioning)
confidence: 99%
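The statement above says the imitation-learning method enforces optimal alignment in the loss and allows action-space exploration instead of relying on a fixed aligner. A hedged sketch of how an expert policy could provide that supervision, assuming a copy/delete/insert action set and a unit-cost edit model (both are assumptions, not the authors' code): from any configuration it returns every action that stays on a minimum-cost edit path, so it can be queried even from states the model reached on its own.

```python
# Sketch only: an expert policy derived from a cost-to-go dynamic program.
# All actions returned by expert_actions() are equally optimal, so the loss
# can share probability mass over them rather than fixing one gold sequence.
from functools import lru_cache

@lru_cache(maxsize=None)
def cost_to_go(src, tgt):
    """Minimum number of delete/insert actions to turn src into tgt (copy is free)."""
    if not src:
        return len(tgt)                                   # only inserts remain
    if not tgt:
        return len(src)                                   # only deletes remain
    costs = [cost_to_go(src[1:], tgt) + 1,                # delete src[0]
             cost_to_go(src, tgt[1:]) + 1]                # insert tgt[0]
    if src[0] == tgt[0]:
        costs.append(cost_to_go(src[1:], tgt[1:]))        # copy src[0] for free
    return min(costs)

def expert_actions(remaining_src, remaining_tgt):
    """Set of actions whose successor state keeps the cost-to-go minimal."""
    best = cost_to_go(remaining_src, remaining_tgt)
    optimal = set()
    if remaining_src and cost_to_go(remaining_src[1:], remaining_tgt) + 1 == best:
        optimal.add(("delete", ""))
    if remaining_tgt and cost_to_go(remaining_src, remaining_tgt[1:]) + 1 == best:
        optimal.add(("insert", remaining_tgt[0]))
    if remaining_src and remaining_tgt and remaining_src[0] == remaining_tgt[0] \
            and cost_to_go(remaining_src[1:], remaining_tgt[1:]) == best:
        optimal.add(("copy", ""))
    return optimal

# Queried mid-derivation of "geben" -> "gibt", the expert accepts either order
# of the e-deletion and i-insertion: both ("delete", "") and ("insert", "i").
print(expert_actions("eben", "ibt"))
```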
“…The previous state-of-the-art model, Bergmanis et al. (2017), is a non-monotonic system that outperformed the monotonic system of Makarov et al. (2017). However, Makarov et al. (2017) is a pipeline system that took alignments from an existing aligner; such a system has no means by which it can recover from a poor initial alignment. [Accuracy table spilled into this extract: Makarov et al. (2017) 93.9, 0-HARD 94.5, Bergmanis et al. (2017) 94.6, Makarov and Clematide (2018) …] We show that jointly learning monotonic alignments leads to improved results.…”
Section: Experimental Findings (mentioning)
confidence: 99%
“…Obtaining gold action sequences as a previous, independent step presents a drawback, as pointed out by Makarov and Clematide (2018a). The optimal action sequence obtained for a certain word-lemma pair might not be unique.…”
Section: Fixed Gold Action Sequences (mentioning)
confidence: 99%
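To make that non-uniqueness concrete: under a copy/delete/insert action set (an illustrative assumption, not the cited papers' exact action inventory), two different action sequences of equal, minimal length can produce the same word-lemma transduction, so neither one is uniquely "the" gold sequence.

```python
# Illustrative only: two distinct action sequences that both map the lemma
# "geben" to the form "gibt" with the same number of actions, showing the
# spurious ambiguity of a single "gold" action sequence.
def run(src, actions):
    i, out = 0, []
    for kind, ch in actions:
        if kind == "copy":
            out.append(src[i]); i += 1
        elif kind == "delete":
            i += 1
        else:                        # insert
            out.append(ch)
    return "".join(out)

seq_a = [("copy", ""), ("delete", ""), ("insert", "i"),
         ("copy", ""), ("delete", ""), ("delete", ""), ("insert", "t")]
seq_b = [("copy", ""), ("insert", "i"), ("delete", ""),
         ("copy", ""), ("delete", ""), ("delete", ""), ("insert", "t")]

assert run("geben", seq_a) == run("geben", seq_b) == "gibt"
assert len(seq_a) == len(seq_b)      # equally short: neither is privileged
```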