Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/p17-1080

What do Neural Machine Translation Models Learn about Morphology?

Abstract: Neural machine translation (MT) models obtain state-of-the-art performance while maintaining a simple, end-to-end architecture. However, little is known about what these models learn about source and target languages during the training process. In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks. We conduct a t…
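The evaluation the abstract describes is a probing setup: hidden states are extracted from a trained, frozen NMT encoder, and a lightweight classifier is trained on top of them to predict part-of-speech or morphological tags, so that classifier accuracy serves as a proxy for how much morphology the representations encode. A minimal PyTorch sketch of that idea follows; the dimensions, the Probe class, and the toy tensors are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: assume a frozen NMT encoder emits 500-d
# hidden states, probed with a linear layer over 17 POS tags.
HIDDEN_DIM, NUM_TAGS = 500, 17

class Probe(nn.Module):
    """Logistic-regression probe over frozen encoder states."""
    def __init__(self, hidden_dim: int, num_tags: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_tags)

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        # states: (num_tokens, hidden_dim); gradients never
        # reach the encoder because its outputs are detached.
        return self.classifier(states)

probe = Probe(HIDDEN_DIM, NUM_TAGS)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy stand-ins for detached encoder states and gold POS tags.
states = torch.randn(32, HIDDEN_DIM)
tags = torch.randint(0, NUM_TAGS, (32,))

for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(probe(states), tags)
    loss.backward()
    optimizer.step()
```

The design point is that the probe is deliberately weak (a single linear layer), so its accuracy reflects what is linearly decodable from the representations rather than what the probe itself can compute.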

Cited by 238 publications (282 citation statements)
References 31 publications (38 reference statements)

Citation statements:
“…Vinyals et al., 2015), and social cues available to humans (see Box 3). Despite the limitation of the training set and objective function, surprisingly, models of this kind (e.g., Devlin et al., 2018) may also implicitly learn some compositional properties of language, such as syntax, from the structure of the input (Linzen et al., 2016; Belinkov et al., 2017; Baroni, 2019; Hewitt & Manning, 2019).…”
Section: Language Model (GPT-2), mentioning
confidence: 99%
“…Several authors have proposed convolutional neural networks over character sequences, as part of models of part-of-speech tagging (Santos and Zadrozny, 2014), named entity recognition (Ma and Hovy, 2016; Chiu and Nichols, 2015), language modeling (Kim et al., 2015), and machine translation (Costa-jussà and Fonollosa, 2016; Belinkov et al., 2017). The latter presents an in-depth analysis of representations learned by neural MT models.…”
Section: Related Work, mentioning
confidence: 99%
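For concreteness, a character-level CNN word encoder of the kind this statement describes can be sketched as below: embed the characters of a word, convolve over the character axis, and max-pool to obtain a fixed-size word vector. This is a minimal sketch under assumed vocabulary and filter sizes, not the exact architecture of any of the cited papers.

```python
import torch
import torch.nn as nn

class CharCNNWordEncoder(nn.Module):
    """Builds a word representation from its character sequence:
    embed chars, apply a 1-D convolution, max-pool over positions."""
    def __init__(self, num_chars=100, char_dim=25, num_filters=50, width=3):
        super().__init__()
        self.embed = nn.Embedding(num_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, num_filters,
                              kernel_size=width, padding=1)

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        # char_ids: (batch, max_word_len) -> (batch, num_filters)
        x = self.embed(char_ids).transpose(1, 2)  # (batch, char_dim, len)
        x = torch.relu(self.conv(x))              # (batch, filters, len)
        return x.max(dim=2).values                # pool over characters

encoder = CharCNNWordEncoder()
words = torch.randint(0, 100, (4, 12))  # toy batch: 4 words, 12 chars each
print(encoder(words).shape)             # torch.Size([4, 50])
```

Because the pooling step collapses the character axis, words of any length map to the same-size vector, which is what lets such encoders feed tagging or translation models in place of word embeddings.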
“…Moreover, it is crucial for trainers to understand whether a model learns a good representation of the data as a secondary effect of the training, and to detect potential biases or origins of errors in a model [9]. To address this issue, many model-understanding techniques aim to visualize or analyze learned global features of a model [8, 12, 63, 90].…”
Section: Passive Observation, mentioning
confidence: 99%