Findings of the Association for Computational Linguistics: EMNLP 2021
DOI: 10.18653/v1/2021.findings-emnlp.226

Patterns of Polysemy and Homonymy in Contextualised Language Models

Abstract: One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts. In this paper we investigate the extent to which the contextualised embeddings of word forms that display multiplicity of sense reflect traditional distinctions of polysemy and homonymy. To this end, we introduce an extended, human-annotated dataset of graded word sense similarity and co-predication acceptability, and evaluate how well the similarit…

Cited by 10 publications (16 citation statements). References 42 publications.
“…To operationalize this notion of continuity, we used BERT (Devlin et al., 2018), a state-of-the-art neural language model (NLM). There is a growing body of literature using BERT and other NLMs as operationalizations of human lexical-semantic knowledge in general (Haber & Poesio, 2020a, 2020b; Li & Joanisse, 2021; Nair et al., 2020; Trott & Bergen, 2021), and to test Elman’s (2004; 2009) cues to meaning framework in particular (Li & Joanisse, 2021; Trott & Bergen, 2021). It is important to note that BERT (like most NLMs) is trained on linguistic input alone (Bender & Koller, 2020), and lacks access to any extralinguistic sources of information that humans might use to represent the meanings of a word, such as sensorimotor associations.…”
Section: Current Work
confidence: 99%
“…These contextualized embeddings have been shown to improve performance on a number of downstream Natural Language Processing tasks involving lexical ambiguity, such as word sense disambiguation (Aina et al., 2019; Loureiro et al., 2020). Past work also suggests that BERT can be used to distinguish monosemous and polysemous words, or even polysemy and homonymy (Haber & Poesio, 2020a, 2020b; Nair et al., 2020; Soler & Apidianaki, 2021), and that BERT’s representations encode sense-like information (Karidi et al., 2021). Most relevantly for our purposes, BERT’s contextualized embeddings are well-suited for measuring contextual distance in a graded manner: given two contextualized embeddings of an ambiguous target word (e.g., for “marinated lamb” and “friendly lamb”), we can compute the cosine distance between those vectors, a metric often used to assess proximity in vector space.…”
Section: Current Work
confidence: 99%
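The excerpt above describes a graded contextual-distance measure. As a concrete illustration, here is a minimal Python sketch using the Hugging Face transformers library; the checkpoint name and the two example sentences are illustrative assumptions, and the cited studies may differ in their choice of layer and token pooling.

import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative checkpoint; the cited work may use a different BERT variant.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Return the final-layer embedding of `target` within `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    # Simplifying assumption: `target` survives tokenisation as a single
    # wordpiece in the uncased vocabulary.
    target_id = tokenizer.convert_tokens_to_ids(target)
    position = (enc["input_ids"][0] == target_id).nonzero()[0].item()
    return hidden[position]

emb_a = target_embedding("they marinated the lamb overnight.", "lamb")
emb_b = target_embedding("the friendly lamb followed the children.", "lamb")

# Cosine distance = 1 - cosine similarity; larger values suggest the two
# contexts pull the ambiguous word towards more distinct senses.
similarity = torch.nn.functional.cosine_similarity(emb_a, emb_b, dim=0)
print(f"cosine distance: {1 - similarity.item():.3f}")

In practice one would verify that the target word is not split into multiple wordpieces, or average the wordpiece vectors when it is; the resulting distance is continuous rather than categorical, which is what makes it suitable for comparing graded human similarity judgments.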
“…Having demonstrated the utility of the CS Norms on a small subset of English words, one obvious direction for future research would be to expand this dataset, including more words, more senses and sentences per word, a wider variety of sentences (i.e., both experimentally controlled and naturalistic sentences), and additional languages. Similarly, existing datasets on lexical ambiguity (Haber and Poesio, 2021; Karidi et al., 2021; Schlechtweg et al., 2021; Erk et al., 2013) could be augmented with sensorimotor judgments. Further, because the original RAW-C items were adapted from psycholinguistic studies (Trott and Bergen, 2021), those items might be skewed towards the phenomena those researchers were interested in; for example, it is possible that certain polysemous relationships (metaphor and metonymy) may be overrepresented.…”
Section: Limitations
confidence: 99%
“…Despite the promise and early success of this approach, it faces a key limitation: resources like the LS Norms typically contain just a single set of judgments for each word. In practice, however, many words are ambiguous (Rodd et al., 2004; Haber and Poesio, 2021). In English, anywhere from 7% (Rodd et al., 2004) to 15% (Trott and Bergen, 2020) of words have multiple, unrelated meanings, and as many as 84% are polysemous, i.e., they have multiple, related meanings (Rodd et al., 2004).…”
Section: Introduction
confidence: 99%
“…These models have achieved a good level of performance in many natural language processing tasks (Devlin et al., 2018; Radford et al., 2018a; Liu et al., 2019b; Radford et al., 2018b; Lan et al., 2019). Among these models, transformer-architecture models such as Bidirectional Encoder Representations from Transformers (BERT; Devlin et al., 2018) and Generative Pre-Training 2 (GPT-2; Radford et al., 2018b) show the best performance for this task of polysemy interpretation (Haber and Poesio, 2021; Soler and Apidianaki, 2021; Yenicelik et al., 2020).…”
Section: Introduction
confidence: 99%