Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1314
Imitation Learning for Neural Morphological String Transduction

Abstract: We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert p…
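The abstract refers to a neural transition-based string transducer that rewrites an input string by emitting edit actions. As a rough illustration only (the action names, data layout, and example are assumptions, not the paper's implementation), a minimal sketch of such an action system:

```python
# Hypothetical sketch of a transition-based string transducer's action space:
# the model reads the input left to right and emits copy / delete / insert
# actions. Names and layout are illustrative, not taken from the paper.
def transduce(src, actions):
    """Execute an action sequence over src and return the output string."""
    i, out = 0, []
    for kind, ch in actions:
        if kind == "copy":           # copy the current input character
            out.append(src[i]); i += 1
        elif kind == "delete":       # consume the current input character silently
            i += 1
        elif kind == "insert":       # emit a character without consuming input
            out.append(ch)
        else:
            raise ValueError(kind)
    return "".join(out)

# Example: inflecting German "legen" to the past participle "gelegt".
actions = [("insert", "g"), ("insert", "e"),
           ("copy", ""), ("copy", ""), ("copy", ""),
           ("delete", ""), ("delete", ""), ("insert", "t")]
print(transduce("legen", actions))   # -> gelegt
```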

Cited by 29 publications (34 citation statements)
References 24 publications
“…Thus, with an appropriate training method, the neural transition-based model can be very strong in the high data setting. This is in line with the results for the SIGMORPHON 2016 dataset in Makarov and Clematide (2018a). On the other hand, we expect that gains can be made with the general soft-attention seq2seq model (or any latent-alignment model), by applying the same training method or other existing alternatives (Edunov et al, 2017).…”
Section: Task I Results and Discussion (supporting)
confidence: 86%
“…Typically, this model is trained by maximizing the likelihood of gold action sequences that come from a separate character aligner. This year, we train with an imitation learning method (Makarov and Clematide, 2018a) that enforces optimal alignment in the loss and additionally supports action-space exploration and the optimization of a task-specific objective. Our method entirely eliminates the need for a character aligner and results in substantially stronger models, at the expense of a slight increase in training time.…”
Section: Introduction (mentioning)
confidence: 99%
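The statement above says the imitation-learning method enforces optimal alignment in the loss and allows action-space exploration instead of relying on a fixed aligner. A hedged sketch of how an expert policy could provide that supervision, assuming a copy/delete/insert action set and a unit-cost edit model (both are assumptions, not the authors' code): from any configuration it returns every action that stays on a minimum-cost edit path, so it can be queried even from states the model reached on its own.

```python
# Sketch only: an expert policy derived from a cost-to-go dynamic program.
# All actions returned by expert_actions() are equally optimal, so the loss
# can share probability mass over them rather than fixing one gold sequence.
from functools import lru_cache

@lru_cache(maxsize=None)
def cost_to_go(src, tgt):
    """Minimum number of delete/insert actions to turn src into tgt (copy is free)."""
    if not src:
        return len(tgt)                                   # only inserts remain
    if not tgt:
        return len(src)                                   # only deletes remain
    costs = [cost_to_go(src[1:], tgt) + 1,                # delete src[0]
             cost_to_go(src, tgt[1:]) + 1]                # insert tgt[0]
    if src[0] == tgt[0]:
        costs.append(cost_to_go(src[1:], tgt[1:]))        # copy src[0] for free
    return min(costs)

def expert_actions(remaining_src, remaining_tgt):
    """Set of actions whose successor state keeps the cost-to-go minimal."""
    best = cost_to_go(remaining_src, remaining_tgt)
    optimal = set()
    if remaining_src and cost_to_go(remaining_src[1:], remaining_tgt) + 1 == best:
        optimal.add(("delete", ""))
    if remaining_tgt and cost_to_go(remaining_src, remaining_tgt[1:]) + 1 == best:
        optimal.add(("insert", remaining_tgt[0]))
    if remaining_src and remaining_tgt and remaining_src[0] == remaining_tgt[0] \
            and cost_to_go(remaining_src[1:], remaining_tgt[1:]) == best:
        optimal.add(("copy", ""))
    return optimal

# Queried mid-derivation of "geben" -> "gibt", the expert accepts either order
# of the e-deletion and i-insertion: both ("delete", "") and ("insert", "i").
print(expert_actions("eben", "ibt"))
```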
“…The previous state-of-the-art model, Bergmanis et al. (2017), is a non-monotonic system that outperformed the monotonic system of Makarov et al. (2017). However, Makarov et al. (2017) is a pipeline system that took alignments from an existing aligner; such a system has no means by which it can recover from a poor initial alignment. [Accuracy table spilled into this extract: Makarov et al. (2017) 93.9, 0-HARD 94.5, Bergmanis et al. (2017) 94.6, Makarov and Clematide (2018) …] We show that jointly learning monotonic alignments leads to improved results.…”
Section: Experimental Findings (mentioning)
confidence: 99%
“…Obtaining gold action sequences as a previous, independent step presents a drawback, as pointed out by Makarov and Clematide (2018a). The optimal action sequence obtained for a certain word-lemma pair might not be unique.…”
Section: Fixed Gold Action Sequences (mentioning)
confidence: 99%
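To make that non-uniqueness concrete: under a copy/delete/insert action set (an illustrative assumption, not the cited papers' exact action inventory), two different action sequences of equal, minimal length can produce the same word-lemma transduction, so neither one is uniquely "the" gold sequence.

```python
# Illustrative only: two distinct action sequences that both map the lemma
# "geben" to the form "gibt" with the same number of actions, showing the
# spurious ambiguity of a single "gold" action sequence.
def run(src, actions):
    i, out = 0, []
    for kind, ch in actions:
        if kind == "copy":
            out.append(src[i]); i += 1
        elif kind == "delete":
            i += 1
        else:                        # insert
            out.append(ch)
    return "".join(out)

seq_a = [("copy", ""), ("delete", ""), ("insert", "i"),
         ("copy", ""), ("delete", ""), ("delete", ""), ("insert", "t")]
seq_b = [("copy", ""), ("insert", "i"), ("delete", ""),
         ("copy", ""), ("delete", ""), ("delete", ""), ("insert", "t")]

assert run("geben", seq_a) == run("geben", seq_b) == "gibt"
assert len(seq_a) == len(seq_b)      # equally short: neither is privileged
```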