Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2022.emnlp-main.746

The better your Syntax, the better your Semantics? Probing Pretrained Language Models for the English Comparative Correlative

Abstract: Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investig…

Cited by 6 publications (5 citation statements)
References 30 publications

Citation statements
“…Research into language models could potentially benefit from linguistic studies in CxG, as those generally pay attention to constructional meaning (syntax + semantics). To illustrate, we can consider work by Weissweiler et al (2022), who study the extent to which PLMs can capture the syntactic and semantic information associated with the English comparative correlative (CC), e.g. the better your syntax, the better your semantics or the more you read, the more you learn.…”
Section: Possible Linguistic Shortcomings of Language Models (citation type: mentioning, confidence: 99%)
“…The authors generate synthetic data by modifying the second part of the CC, that is, the part that comes after 'the X-er'. For example, sentences which are instances of the CC with the pattern 'the ADV-er the NUM NOUN VERB' (the harder the two cats fight) are reordered as 'the ADJ-er NUM VERB the NOUN' (the harder two fight the cats) to generate a false instance, i.e., one that is not an instance of the CC (Weissweiler et al 2022).…”
Section: Possible Linguistic Shortcomings of Language Models (citation type: mentioning, confidence: 99%)
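To make the reordering described in this statement concrete, the following is a minimal Python sketch of how such a false instance could be produced. The function name, the fixed six-token pattern, and the whitespace tokenisation are illustrative assumptions for this example only, not the data-generation code published by Weissweiler et al. (2022).

# Minimal sketch of the negative-example generation described above.
# Assumes the second half of a comparative correlative is already
# whitespace-tokenised and matches the pattern "the X-er the NUM NOUN VERB",
# e.g. "the harder the two cats fight". The positions and the function name
# are assumptions for illustration, not the authors' implementation.

def make_false_instance(tokens: list[str]) -> list[str]:
    # Expected input positions: [the, X-er, the, NUM, NOUN, VERB]
    the1, comp, the2, num, noun, verb = tokens
    # Reorder to "the X-er NUM VERB the NOUN", e.g. "the harder two fight the cats",
    # which no longer instantiates the comparative correlative construction.
    return [the1, comp, num, verb, the2, noun]

if __name__ == "__main__":
    positive = "the harder the two cats fight".split()
    print(" ".join(make_false_instance(positive)))
    # -> "the harder two fight the cats"

The point of the reordering is that every word of the original sentence is kept, so a model cannot rely on lexical cues alone and must be sensitive to the construction's word order to tell true and false instances apart.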
“…The default tool for the majority of NLP tasks is now de facto pretrained language models (PLMs; Devlin et al., 2019; Liu et al., 2019b; Radford et al., 2019; Brown et al., 2020; Clark et al., 2020; Raffel et al., 2020; Chowdhery et al., 2022; Hoffmann et al., 2022; Touvron et al., 2023, inter alia), which are trained using language modeling objectives on large text corpora. Despite the conceptual simplicity of language modeling, pretraining induces complex forms of linguistic knowledge in PLMs, at various levels (Rogers et al., 2020; Mahowald et al., 2023): morphological (Edmiston, 2020; Hofmann et al., 2020; Weissweiler et al., 2023), lexical (Ethayarajh, 2019), syntactic (Hewitt and Manning, 2019; Jawahar et al., 2019; Wei et al., 2021; Weissweiler et al., 2022), and semantic (Wiedemann et al., 2019; Ettinger, 2020). This general linguistic knowledge is then (re-)shaped for concrete tasks via fine-tuning, i.e., supervised training on task-specific labeled data.…”
Section: Introduction (citation type: mentioning, confidence: 99%)
“…The content of this manuscript has been presented in part at the 2022 Conference on Empirical Methods in Natural Language Processing (Weissweiler et al., 2022), and the Construction Grammars and NLP (CxGs+NLP) Workshop at the Georgetown University Round Table on Linguistics (GURT; Weissweiler et al., 2023). We are very grateful to David Mortensen and Lori Levin for helpful discussions and comments.…”
Citation type: mentioning, confidence: 99%