Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.397
Assessing Phrasal Representation and Composition in Transformers

Abstract: Deep transformer models have pushed performance on NLP tasks to new limits, suggesting sophisticated treatment of complex linguistic inputs, such as phrases. However, we have limited understanding of how these models handle representation of phrases, and whether this reflects sophisticated composition of phrase meaning like that done by humans. In this paper, we present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers. We use tests leveraging human judgments of phrase…

Cited by 46 publications (89 citation statements)
References 36 publications
“…Whether our models have learned to solve tasks in robust and generalizable ways has been a topic of much recent interest. Challenging test sets have shown that many state-of-the-art NLP models struggle with compositionality (Kim and Linzen, 2020; Yu and Ettinger, 2020; White et al., 2020), and find it difficult to pass the myriad stress tests for social (May et al., 2019; Nangia et al., 2020) and/or linguistic competencies (Geiger et al., 2018; Naik et al., 2018; Glockner et al., 2018; White et al., 2018; Warstadt et al., 2019; Gauthier et al., 2020; Hossain et al., 2020; Jeretic et al., 2020; Lewis et al., 2020; Saha et al., 2020; Schuster et al., 2020; Sugawara et al., 2020). Yet, challenge sets may suffer from performance instability (Liu et al., 2019a; Rozen et al., 2019) and often lack sufficient statistical power (Card et al., 2020), suggesting that, although they may be valuable assessment tools, they are not sufficient for ensuring that our models have achieved the learning targets we set for them.…”
Section: Challenge Sets and Adversarial Settings (mentioning)
confidence: 99%
“…Phrase and sentence composition has drawn frequent attention in analysis of neural models, often focusing on analysis of internal representations and downstream task behavior (Ettinger et al., 2018; Conneau et al., 2019; Nandakumar et al., 2019; Yu and Ettinger, 2020; Bhathena et al., 2020; Mu and Andreas, 2020; Andreas, 2019). Some work investigates compositionality via constructing linguistic (Keysers et al., 2019) and non-linguistic (Liška et al., 2018; Hupkes et al., 2018; Baan et al., 2019) … and Ettinger (2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…The versatility of these pre-trained models suggests that they may acquire fairly robust linguistic knowledge and capacity for natural language "understanding". However, an emerging body of analysis demonstrates a level of superficiality in these models' handling of language (Niven and Kao, 2019; Kim and Linzen, 2020; Ettinger, 2020; Yu and Ettinger, 2020).…”
Section: Introduction (mentioning)
confidence: 99%