Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2022.emnlp-main.746

The better your Syntax, the better your Semantics? Probing Pretrained Language Models for the English Comparative Correlative

Abstract: Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investig…

Cited by 6 publications (5 citation statements)
References 30 publications

Citation statements
“…Research into language models could potentially benefit from linguistic studies in CxG, as those generally pay attention to constructional meaning (syntax + semantics). To illustrate, we can consider work by Weissweiler et al (2022), who study the extent to which PLMs can capture the syntactic and semantic information associated with the English comparative correlative (CC), e.g. the better your syntax, the better your semantics or the more you read, the more you learn.…”
Section: Possible Linguistic Shortcomings of Language Models (citation type: mentioning, confidence: 99%)
“…The authors generate synthetic data by modifying the second part of the CC, that is, the part that comes after 'the X-er'. For example, sentences which are instances of the CC with the pattern 'the ADV-er the NUM NOUN VERB' (the harder the two cats fight) are reordered as 'the ADJ-er NUM VERB the NOUN' (the harder two fight the cats) to generate a false instance, i.e., one that is not an instance of the CC (Weissweiler et al 2022).…”
Section: Possible Linguistic Shortcomings of Language Models (citation type: mentioning, confidence: 99%)
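To make the reordering described in this statement concrete, the following is a minimal Python sketch of how such a false instance could be produced. The function name, the fixed six-token pattern, and the whitespace tokenisation are illustrative assumptions for this example only, not the data-generation code published by Weissweiler et al. (2022).

# Minimal sketch of the negative-example generation described above.
# Assumes the second half of a comparative correlative is already
# whitespace-tokenised and matches the pattern "the X-er the NUM NOUN VERB",
# e.g. "the harder the two cats fight". The positions and the function name
# are assumptions for illustration, not the authors' implementation.

def make_false_instance(tokens: list[str]) -> list[str]:
    # Expected input positions: [the, X-er, the, NUM, NOUN, VERB]
    the1, comp, the2, num, noun, verb = tokens
    # Reorder to "the X-er NUM VERB the NOUN", e.g. "the harder two fight the cats",
    # which no longer instantiates the comparative correlative construction.
    return [the1, comp, num, verb, the2, noun]

if __name__ == "__main__":
    positive = "the harder the two cats fight".split()
    print(" ".join(make_false_instance(positive)))
    # -> "the harder two fight the cats"

The point of the reordering is that every word of the original sentence is kept, so a model cannot rely on lexical cues alone and must be sensitive to the construction's word order to tell true and false instances apart.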
“…The default tool for the majority of NLP tasks is now de facto pretrained language models (PLMs; Devlin et al., 2019; Liu et al., 2019b; Radford et al., 2019; Brown et al., 2020; Clark et al., 2020; Raffel et al., 2020; Chowdhery et al., 2022; Hoffmann et al., 2022; Touvron et al., 2023, inter alia), which are trained using language modeling objectives on large text corpora. Despite the conceptual simplicity of language modeling, pretraining induces complex forms of linguistic knowledge in PLMs, at various levels (Rogers et al., 2020; Mahowald et al., 2023): morphological (Edmiston, 2020; Hofmann et al., 2020; Weissweiler et al., 2023), lexical (Ethayarajh, 2019), syntactic (Hewitt and Manning, 2019; Jawahar et al., 2019; Wei et al., 2021; Weissweiler et al., 2022), and semantic (Wiedemann et al., 2019; Ettinger, 2020). This general linguistic knowledge is then (re-)shaped for concrete tasks via fine-tuning, i.e., supervised training on task-specific labeled data.…”
Section: Introduction (citation type: mentioning, confidence: 99%)
“…The content of this manuscript has been presented in part at the 2022 Conference on Empirical Methods in Natural Language Processing (Weissweiler et al., 2022), and the Construction Grammars and NLP (CxGs+NLP) Workshop at the Georgetown University Round Table on Linguistics (GURT; Weissweiler et al., 2023). We are very grateful to David Mortensen and Lori Levin for helpful discussions and comments.…”
Citation type: mentioning, confidence: 99%