Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.439
Language Models Use Monotonicity to Assess NPI Licensing

Abstract: We investigate the semantic knowledge of language models (LMs), focusing on (1) whether these LMs create categories of linguistic environments based on their semantic monotonicity properties, and (2) whether these categories play a similar role in LMs as in human language understanding, using negative polarity item licensing as a case study. We introduce a series of experiments consisting of probing with diagnostic classifiers (DCs), linguistic acceptability tasks, as well as a novel DC ranking method that tig…

Cited by 13 publications (12 citation statements)
References 37 publications (38 reference statements)
“…On the utility of probing tasks. Many recent papers provide compelling evidence that BERT contains a surprising amount of syntax, semantics, and world knowledge (Giulianelli et al., 2018; Rogers et al., 2020; Lakretz et al., 2019; Jumelet et al., 2019, 2021). Many of these works involve diagnostic classifiers or parametric probes, i.e.…”
Section: Related Work
confidence: 99%
“…Most of the existing methods inspect a pre-specified model component (e.g., individual BERT layers) in a top-down manner. A typical approach first takes aim at specific linguistic phenomena that would be captured by the target components, and then trains a probing classifier that predicts the chosen linguistic phenomena from the target components (Bau et al., 2018; Giulianelli et al., 2018; Dalvi et al., 2019; Lakretz et al., 2019; Kovaleva et al., 2019; Goldberg, 2019; Petroni et al., 2019; Hewitt and Manning, 2019; Jawahar et al., 2019; Durrani et al., 2020; Zhou and Srikumar, 2021; Cao et al., 2021; Jumelet et al., 2021).…”
Section: Introduction
confidence: 99%
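The probing-classifier recipe quoted above can be sketched as follows. This is a minimal illustration, not the cited papers' setup: the "hidden states" are synthetic, and the binary label (standing in for, e.g., the monotonicity class of a licensing environment) is deliberately injected along one direction so that a linear probe can recover it.

```python
import numpy as np

rng = np.random.default_rng(0)

dim, n = 16, 400
# Fabricated hidden states: a binary "linguistic" label is encoded along
# one dimension, mimicking a property linearly decodable from LM states.
labels = rng.integers(0, 2, n)
states = rng.normal(size=(n, dim))
states[:, 0] += 3.0 * labels  # inject a linearly separable signal

def train_probe(X, y, lr=0.1, epochs=200):
    """Logistic-regression probe trained with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on weights
        b -= lr * np.mean(p - y)                # gradient step on bias
    return w, b

# Train on the first 300 examples, evaluate on the held-out 100.
w, b = train_probe(states[:300], labels[:300])
preds = (states[300:] @ w + b > 0).astype(int)
accuracy = np.mean(preds == labels[300:])
print(f"probe accuracy: {accuracy:.2f}")
```

High held-out accuracy is then taken as evidence that the probed representation encodes the property; the cited works differ mainly in which component is probed and which phenomenon is predicted.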
“…A range of tests for causal language models consider whether a model can represent a particular linguistic phenomenon (i.e., subject-verb agreement, filler-gap dependencies, negative polarity items; Jumelet et al., 2019, 2021; Wilcox et al., 2018; Gulordava et al., 2018), by measuring whether that model assigns a higher probability to a grammatical sentence involving that phenomenon than to its minimally different ungrammatical counterpart. In such tests, the comparison of probabilities is often focused on the probability of a single token: for instance, the probability of the correct and incorrect verb form in a long sentence (Linzen et al., 2016).…”
Section: Methods
confidence: 99%
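The single-token minimal-pair comparison described in this excerpt can be illustrated with a toy scorer. The bigram log-probability table below is invented for the example and is not a real LM; the test passes when the critical token of the grammatical variant receives a higher score than its ungrammatical counterpart.

```python
import math

# Hand-built stand-in for an LM's next-token log-probabilities.
bigram_logprob = {
    ("dogs", "bark"): math.log(0.6),   # grammatical agreement
    ("dogs", "barks"): math.log(0.1),  # ungrammatical agreement
}

def critical_token_score(prev_token, target_token):
    """Log-probability of the single critical token, given its context."""
    return bigram_logprob[(prev_token, target_token)]

# Minimal pair differing only in the verb form.
grammatical = critical_token_score("dogs", "bark")
ungrammatical = critical_token_score("dogs", "barks")
print("model prefers grammatical form:", grammatical > ungrammatical)
```

With a real causal LM, the two scores would come from the model's softmax over the vocabulary at the critical position; the evaluation logic is otherwise identical.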