2022
DOI: 10.3389/frai.2022.813967
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification

Abstract: This article examines the basis of Natural Language Understanding in transformer-based language models, such as BERT. It does this through a case study on idiom token classification. We use idiom token identification as the basis for our analysis because of the variety of information types that have previously been explored in the literature for this task, including topic, lexical, and syntactic features. This variety of relevant information types means that the task of idiom token identification enables us to …

Cited by 4 publications (7 citation statements)
References 27 publications
“…From the results in Section 2 and Section 5, it is evident that the meanings of IEs cannot be learned from general corpora alone (even when there is a collection of sentences with IEs); rather, external knowledge (e.g., IE definitions) is fundamental to providing the strong supervising signal (i.e., the similarity forcing loss) needed for training. Taking this into consideration, we believe that it is impractical to generalize the representation ability to unseen idioms because (1) intuitively, each IE has a unique origin, metaphorical linkage, and interpretation, so the meanings of IEs have to be learned on a case-by-case basis; and (2) from our error analysis, even with the same training data and objective, the learning difficulty is highly idiom dependent, a point that is also corroborated by Nedumpozhimana et al. (2022). Therefore, we do not currently see a practical way to generalize GIEA to idioms that are unseen.…”
Section: MPNet vs. BART for Definition Embedding (mentioning)
confidence: 53%
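The excerpt above credits external knowledge (IE definitions) with supplying the supervising signal via a similarity forcing loss. As a rough illustration only, the following is a minimal PyTorch sketch of such a loss, under the assumption that it pulls the embedding of a sentence containing an idiomatic expression towards an embedding of that expression's definition; the function name and exact formulation are illustrative, not the cited authors' implementation.

```python
import torch
import torch.nn.functional as F

def similarity_forcing_loss(sentence_embeddings: torch.Tensor,
                            definition_embeddings: torch.Tensor) -> torch.Tensor:
    """Hypothetical similarity forcing loss (assumed formulation).

    Both tensors have shape (batch_size, hidden_dim). The definition
    embeddings act as the external supervising signal: the loss rewards
    high cosine similarity between a sentence containing an idiomatic
    expression and the embedding of that expression's definition.
    """
    cosine_sim = F.cosine_similarity(sentence_embeddings, definition_embeddings, dim=-1)
    # Maximising similarity is equivalent to minimising (1 - similarity).
    return (1.0 - cosine_sim).mean()
```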
“…Similarly, probing experiments on idiomaticity classification indicate that BERT relies on information localized mainly in the idiomatic expression itself, but also in the surrounding context (Nedumpozhimana and Kelleher, 2021). This is indirectly echoed by better performance on individual items whose topic distribution is similar to that of the full dataset (Nedumpozhimana et al., 2022).…”
Section: Contextual Information (mentioning)
confidence: 95%
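The probing result summarized above (idiomaticity information localized mainly in the expression, partly in the surrounding context) can be pictured with a small sketch: pool BERT token states either over the idiom span only or over the context only, then train a linear probe on each pooled representation and compare accuracies. The model choice, the `span_embedding` helper, and the offset-based span matching below are assumptions for illustration, not the cited experimental setup.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def span_embedding(sentence: str, idiom: str, keep_span: bool) -> np.ndarray:
    """Mean-pool BERT token states inside (or outside) the idiom span.

    Assumes the idiom string occurs verbatim in the sentence.
    """
    encoding = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = encoding.pop("offset_mapping")[0].tolist()   # per-token character spans
    with torch.no_grad():
        hidden = model(**encoding).last_hidden_state[0]    # (seq_len, hidden_dim)
    start = sentence.lower().find(idiom.lower())
    end = start + len(idiom)
    in_span = torch.tensor([s < e and start <= s and e <= end for s, e in offsets])
    mask = in_span if keep_span else ~in_span
    return hidden[mask].mean(dim=0).numpy()

# Two probes over the same labelled sentences (hypothetical train_pairs / labels):
# probe_span = LogisticRegression().fit(
#     np.stack([span_embedding(s, i, keep_span=True) for s, i in train_pairs]), labels)
# probe_context = LogisticRegression().fit(
#     np.stack([span_embedding(s, i, keep_span=False) for s, i in train_pairs]), labels)
```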
“…From a dataset perspective, a BERT-based idiomaticity classifier reaches generalizability faster if the idioms to which it is exposed during training are ordered by decreasing contribution to model performance. The contribution is determined by an idiom's Shapley value, estimated as the difference between the average performance of multiple models which do vs. do not include a given idiom in the training data (Nedumpozhimana et al., 2022). From an architecture perspective, the previously discussed use of attention flow to fuse contextualized and static idiom representations is especially beneficial for generalization to unseen idioms and to other domains (Zeng and Bhat, 2021).…”
Section: Memorization and Generalization (mentioning)
confidence: 99%
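The estimator paraphrased in this excerpt lends itself to a short Monte Carlo sketch: train several classifiers on random subsets of idioms and, for each idiom, take the difference between the average score of models whose training data included it and the average score of models whose training data did not. The `train_and_score` callback and the subset-sampling scheme below are assumptions for illustration rather than the exact protocol of Nedumpozhimana et al. (2022).

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence

def estimate_idiom_shapley_values(
    idioms: Sequence[str],
    train_and_score: Callable[[List[str]], float],
    n_models: int = 50,
) -> Dict[str, float]:
    """Approximate each idiom's contribution to classifier performance.

    train_and_score is an assumed callback: it trains an idiomaticity
    classifier on sentences of the given idioms and returns a held-out score.
    """
    subsets: List[List[str]] = []
    scores: List[float] = []
    for _ in range(n_models):
        subset = [idiom for idiom in idioms if random.random() < 0.5]
        subsets.append(subset)
        scores.append(train_and_score(subset))

    contributions: Dict[str, float] = {}
    for idiom in idioms:
        with_idiom = [s for s, sub in zip(scores, subsets) if idiom in sub]
        without_idiom = [s for s, sub in zip(scores, subsets) if idiom not in sub]
        # Difference of average performance with vs. without the idiom.
        if with_idiom and without_idiom:
            contributions[idiom] = mean(with_idiom) - mean(without_idiom)
        else:
            contributions[idiom] = 0.0  # too few samples to estimate
    return contributions
```

Idioms can then be ordered by decreasing estimated contribution to form the training curriculum described in the excerpt.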