2022
DOI: 10.3389/frai.2022.813967
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification

Abstract: This article examines the basis of Natural Language Understanding in transformer-based language models, such as BERT. It does this through a case study on idiom token classification. We use idiom token identification as the basis for our analysis because of the variety of information types that have previously been explored in the literature for this task, including topic, lexical, and syntactic features. This variety of relevant information types means that the task of idiom token identification enables us to …

Cited by 4 publications (7 citation statements)
References 27 publications
“…From the results in Section 2 and Section 5, it is evident that the meanings of IEs cannot be learned from general corpora alone (even when there is a collection of sentences with IEs); rather, external knowledge (e.g., IE definitions) is fundamental to providing the strong supervising signal (i.e., the similarity forcing loss) needed for training. Taking this into consideration, we believe that it is impractical to generalize the representation ability to unseen idioms because (1) intuitively, each IE has a unique origin, metaphorical linkage, and interpretation, so the meanings of IEs have to be learned on a case-by-case basis; and (2) from our error analysis, even with the same training data and objective, the learning difficulty is highly idiom dependent, a point that is also corroborated by Nedumpozhimana et al. (2022). Therefore, we do not currently see a practical way to generalize GIEA to idioms that are unseen.…”
Section: MPNet vs. BART for Definition Embedding (mentioning)
confidence: 53%
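The excerpt above credits external knowledge (IE definitions) with supplying the supervising signal via a similarity forcing loss. As a rough illustration only, the following is a minimal PyTorch sketch of such a loss, under the assumption that it pulls the embedding of a sentence containing an idiomatic expression towards an embedding of that expression's definition; the function name and exact formulation are illustrative, not the cited authors' implementation.

```python
import torch
import torch.nn.functional as F

def similarity_forcing_loss(sentence_embeddings: torch.Tensor,
                            definition_embeddings: torch.Tensor) -> torch.Tensor:
    """Hypothetical similarity forcing loss (assumed formulation).

    Both tensors have shape (batch_size, hidden_dim). The definition
    embeddings act as the external supervising signal: the loss rewards
    high cosine similarity between a sentence containing an idiomatic
    expression and the embedding of that expression's definition.
    """
    cosine_sim = F.cosine_similarity(sentence_embeddings, definition_embeddings, dim=-1)
    # Maximising similarity is equivalent to minimising (1 - similarity).
    return (1.0 - cosine_sim).mean()
```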
“…Similarly, probing experiments on idiomaticity classification indicate that BERT relies on information localized mainly in the idiomatic expression itself, but also in the surrounding context (Nedumpozhimana and Kelleher, 2021). This is indirectly echoed by better performance on individual items whose topic distribution is similar to that of the full dataset (Nedumpozhimana et al., 2022).…”
Section: Contextual Information (mentioning)
confidence: 95%
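The probing result summarized above (idiomaticity information localized mainly in the expression, partly in the surrounding context) can be pictured with a small sketch: pool BERT token states either over the idiom span only or over the context only, then train a linear probe on each pooled representation and compare accuracies. The model choice, the `span_embedding` helper, and the offset-based span matching below are assumptions for illustration, not the cited experimental setup.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def span_embedding(sentence: str, idiom: str, keep_span: bool) -> np.ndarray:
    """Mean-pool BERT token states inside (or outside) the idiom span.

    Assumes the idiom string occurs verbatim in the sentence.
    """
    encoding = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = encoding.pop("offset_mapping")[0].tolist()   # per-token character spans
    with torch.no_grad():
        hidden = model(**encoding).last_hidden_state[0]    # (seq_len, hidden_dim)
    start = sentence.lower().find(idiom.lower())
    end = start + len(idiom)
    in_span = torch.tensor([s < e and start <= s and e <= end for s, e in offsets])
    mask = in_span if keep_span else ~in_span
    return hidden[mask].mean(dim=0).numpy()

# Two probes over the same labelled sentences (hypothetical train_pairs / labels):
# probe_span = LogisticRegression().fit(
#     np.stack([span_embedding(s, i, keep_span=True) for s, i in train_pairs]), labels)
# probe_context = LogisticRegression().fit(
#     np.stack([span_embedding(s, i, keep_span=False) for s, i in train_pairs]), labels)
```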
“…From a dataset perspective, a BERT-based idiomaticity classifier reaches generalizability faster if the idioms to which it is exposed during training are ordered by decreasing contribution to model performance. The contribution is determined by an idiom's Shapley value, estimated as the difference between the average performance of multiple models which do vs. do not include a given idiom in the training data (Nedumpozhimana et al., 2022). From an architecture perspective, the previously discussed use of attention flow to fuse contextualized and static idiom representations is especially beneficial for generalization to unseen idioms and to other domains (Zeng and Bhat, 2021).…”
Section: Memorization and Generalization (mentioning)
confidence: 99%
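The estimator paraphrased in this excerpt lends itself to a short Monte Carlo sketch: train several classifiers on random subsets of idioms and, for each idiom, take the difference between the average score of models whose training data included it and the average score of models whose training data did not. The `train_and_score` callback and the subset-sampling scheme below are assumptions for illustration rather than the exact protocol of Nedumpozhimana et al. (2022).

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence

def estimate_idiom_shapley_values(
    idioms: Sequence[str],
    train_and_score: Callable[[List[str]], float],
    n_models: int = 50,
) -> Dict[str, float]:
    """Approximate each idiom's contribution to classifier performance.

    train_and_score is an assumed callback: it trains an idiomaticity
    classifier on sentences of the given idioms and returns a held-out score.
    """
    subsets: List[List[str]] = []
    scores: List[float] = []
    for _ in range(n_models):
        subset = [idiom for idiom in idioms if random.random() < 0.5]
        subsets.append(subset)
        scores.append(train_and_score(subset))

    contributions: Dict[str, float] = {}
    for idiom in idioms:
        with_idiom = [s for s, sub in zip(scores, subsets) if idiom in sub]
        without_idiom = [s for s, sub in zip(scores, subsets) if idiom not in sub]
        # Difference of average performance with vs. without the idiom.
        if with_idiom and without_idiom:
            contributions[idiom] = mean(with_idiom) - mean(without_idiom)
        else:
            contributions[idiom] = 0.0  # too few samples to estimate
    return contributions
```

Idioms can then be ordered by decreasing estimated contribution to form the training curriculum described in the excerpt.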