Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1376

Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection

Abstract: The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in thi…

Cited by 26 publications (26 citation statements); references 23 publications (33 reference statements). Excerpts from citing publications, with section and confidence metadata, are listed below.
“…When given the task of predicting the judgement and production data from Albright and Hayes’ (2003) novel verb study, each of these exemplar models – depending on the particular instantiation – equals or betters both a state-of-the-art connectionist model (cf. Chandler, 2010, Table 1; Kirov & Cotterell, 2018, Table 5; though see Corkery, Matusevych, & Goldwater, 2019, for concerns regarding the stability of these simulations) and Albright and Hayes’ (2003) own model, which constructs an explicit micro-rule for each and every subregularity; an approach which shows no regard for psychological plausibility, in contrast to many exemplar models which have their origins in models and findings from the non-linguistic categorization literature.…”
Section: Morphologically Inflected Words (mentioning; confidence: 99%)
“…This neural network architecture was originally designed for machine translation, but has been proposed as a baseline for morphophonological learning, and correlates well with human behaviour in a number of such tasks (Kirov 2017). For example, when tested on the experimental results from Albright & Hayes (2003), a Seq2Seq model's predictions correlated with human behaviour better than any previously proposed model (Kirov & Cotterell 2018; although see Corkery et al. 2019 for a critique of these results). The Seq2Seq network learns string-to-string mappings (UR to SR mappings in this case) by updating weights for connections between nodes that are organised into multiple layers.…”
Section: Learning Simulations (mentioning; confidence: 96%)
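To make the mechanism described in the excerpt above concrete, here is a minimal, self-contained sketch of a character-level encoder-decoder trained on string-to-string pairs (present stem to past-tense form). It is a toy illustration only: the tiny training set, hyperparameters, and the attention-free GRU architecture are assumptions made here for brevity, not the actual setup of Kirov and Cotterell (2018), whose model uses attention and is trained on full CELEX data.

```python
# Minimal character-level encoder-decoder (Seq2Seq) sketch for a
# string-to-string inflection mapping, e.g. present stem -> past tense.
# Toy data, sizes, and the attention-free GRU design are assumptions
# made for illustration; they are not Kirov & Cotterell's (2018) setup.
import torch
import torch.nn as nn

pairs = [("walk", "walked"), ("jump", "jumped"), ("sing", "sang")]

# Shared character vocabulary with start/end-of-sequence symbols.
symbols = ["<s>", "</s>"] + sorted({c for s, t in pairs for c in s + t})
vocab = {c: i for i, c in enumerate(symbols)}
V = len(vocab)

def encode(word):
    """Map a word to a tensor of character ids, ending with </s>."""
    return torch.tensor([vocab[c] for c in word] + [vocab["</s>"]])

class Seq2Seq(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(V, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, V)

    def forward(self, src, tgt_in):
        # Encode the source string into a single hidden state, then decode
        # the target string conditioned on it (teacher forcing).
        _, h = self.encoder(self.emb(src).unsqueeze(0))
        dec, _ = self.decoder(self.emb(tgt_in).unsqueeze(0), h)
        return self.out(dec).squeeze(0)  # (target_len, V) logits

model = Seq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):                  # connection weights are updated
    for src, tgt in pairs:                # by backpropagating the loss
        tgt_ids = encode(tgt)
        tgt_in = torch.cat([torch.tensor([vocab["<s>"]]), tgt_ids[:-1]])
        logits = model(encode(src), tgt_in)
        loss = loss_fn(logits, tgt_ids)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Producing a past tense for a nonce stem would then amount to feeding the encoder the new string and generating output characters one at a time from `<s>` until `</s>` is emitted.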
“…Sequence-to-sequence models are also capable of learning stem-affix relationships, both morphological and phonological, as discussed in Section 3. Faruqui et al. (2016) illustrates this for the case of Finnish vowel harmony (see also Corkery et al. 2019); many earlier models with explicit morpheme segmentation were forced to represent this process as suppletive allomorphy (for example, positing the two phonological variants of the inessive suffix, -ssa and -ssä, as suppletive allomorphs), which could lead to overassessment of the system's complexity (Stump and Finkel 2015). But the sequence-to-sequence model learns a generalizable harmony rule.…”
Section: 1 (mentioning; confidence: 99%)
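As a concrete illustration of the point in the excerpt above, the sketch below shows made-up, simplified character-level training pairs for the Finnish inessive, together with a rule-based reference function; the stems and the simplified harmony condition are assumptions for illustration, not data or code from Faruqui et al. (2016). A single sequence-to-sequence model trained on such pairs sees -ssa and -ssä as outputs of one mapping and can induce the backness-harmony generalisation, rather than storing two listed allomorphs.

```python
# Illustrative (made-up) training pairs for the Finnish inessive case.
# A character-level Seq2Seq model trained on stem -> inflected form
# can generalise the -ssa / -ssä alternation from examples like these,
# instead of listing two suppletive allomorphs.
inessive_pairs = [
    ("talo",  "talossa"),    # back-vowel stem  -> -ssa
    ("auto",  "autossa"),    # back-vowel stem  -> -ssa
    ("metsä", "metsässä"),   # front-vowel stem -> -ssä
    ("kylä",  "kylässä"),    # front-vowel stem -> -ssä
]

FRONT_VOWELS = set("äöy")

def inessive(stem: str) -> str:
    """Rule-based reference (simplified harmony: any front vowel => -ssä)."""
    suffix = "ssä" if any(c in FRONT_VOWELS for c in stem) else "ssa"
    return stem + suffix

assert all(inessive(stem) == form for stem, form in inessive_pairs)
```

Unlike the reference rule above, the neural model is never given the category "front vowel"; the harmony generalisation has to emerge from the character distributions in the training pairs.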
“…We address these questions below. Cotterell et al. (2018b) suggests that sequence-to-sequence models can function as cognitive models of infant language learners (though see Corkery et al. (2019) for some differences in behavior for nonce words). But to use a sequence-to-sequence model as a credible stand-in for the human infant, we must determine what the input for acquisition of morphology looks like: the right representation and learning algorithm cannot tell us anything if it is supplied with the wrong data.…”
Section: Modeling Morphological Learning (mentioning; confidence: 99%)