Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2021
DOI: 10.18653/v1/2021.acl-short.73

A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space

Abstract: The representation degeneration problem in Contextual Word Representations (CWRs) hurts the expressiveness of the embedding space by forming an anisotropic cone where even unrelated words have excessively positive correlations. Existing techniques for tackling this issue require a learning process to re-train models with additional objectives and mostly employ a global assessment to study isotropy. Our quantitative analysis over isotropy shows that a local assessment could be more accurate due to the clustered…
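The "excessively positive correlations" the abstract mentions are easy to observe directly. The sketch below is not from the paper; the model choice and sentences are illustrative assumptions. It averages BERT's contextual token embeddings for a few unrelated sentences and reports their mean pairwise cosine similarity, which in an isotropic space would be near zero:

```python
# A minimal sketch (assumed setup, not the paper's code) showing the
# anisotropy of contextual embedding spaces: unrelated sentences still
# get embeddings with high cosine similarity.
import torch
from itertools import combinations
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The stock market fell sharply today.",
    "She planted tulips in the garden.",
    "Quantum computers use qubits.",
]

with torch.no_grad():
    embs = []
    for s in sentences:
        enc = tokenizer(s, return_tensors="pt")
        out = model(**enc).last_hidden_state[0]   # (seq_len, hidden_dim)
        embs.append(out[1:-1].mean(dim=0))        # drop [CLS]/[SEP], average tokens

# For an isotropic space this average would be close to 0.
sims = [torch.cosine_similarity(a, b, dim=0).item()
        for a, b in combinations(embs, 2)]
print(f"mean cosine similarity of unrelated sentences: {sum(sims) / len(sims):.3f}")
```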

Cited by 15 publications (19 citation statements) · References 17 publications (20 reference statements)
“…Cluster-based approach. Based on the clustered structure of pre-trained LMs (Reif et al., 2019), this method can significantly improve the performance of contextual embedding spaces as well as their isotropy (Rajaee and Pilehvar, 2021).…”
Section: Methods
confidence: 99%
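The quoted statement describes the cluster-based operation only at a high level, so the following is a minimal sketch of the idea as Rajaee and Pilehvar (2021) present it: cluster the contextual embeddings, then zero-center each cluster and remove its dominant principal directions (a local variant of Mu and Viswanath's "all-but-the-top"). The cluster count and number of removed directions below are illustrative assumptions, not the paper's tuned values:

```python
# A sketch of the cluster-based isotropy-enhancement idea: per-cluster
# mean removal plus removal of each cluster's dominant PCA directions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def cluster_based_isotropy(embeddings, n_clusters=27, n_dirs=12):
    """embeddings: (n_samples, dim) array of contextual representations."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    improved = np.empty_like(embeddings)
    for c in range(n_clusters):
        idx = labels == c
        cluster = embeddings[idx] - embeddings[idx].mean(axis=0)  # zero-center locally
        pca = PCA(n_components=min(n_dirs, cluster.shape[0])).fit(cluster)
        # Subtract each vector's projection onto the cluster's top directions
        # (a local "all-but-the-top").
        proj = cluster @ pca.components_.T @ pca.components_
        improved[idx] = cluster - proj
    return improved
```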
“…To answer these questions, we consider semantic textual similarity (STS) as the target task and leverage the metric proposed by Mu and Viswanath (2018) for measuring isotropy. The pre-trained BERT and RoBERTa (Liu et al., 2019b) underperform static embeddings on STS, while fine-tuning significantly boosts their performance, suggesting the considerable change that CWRs undergo during fine-tuning (Reimers and Gurevych, 2019; Rajaee and Pilehvar, 2021).…”
Section: Introduction
confidence: 99%
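For reference, the measure of Mu and Viswanath (2018) that this statement relies on is the ratio of the minimum to the maximum of the partition function Z(c) = Σ_w exp(c·w), evaluated over the eigenvectors of W^T W; values near 1 indicate a more isotropic space. A minimal NumPy sketch:

```python
# Isotropy score of Mu and Viswanath (2018): I(W) = min_c Z(c) / max_c Z(c),
# with Z(c) = sum_w exp(c . w) and c ranging over eigenvectors of W^T W.
import numpy as np

def isotropy_score(W):
    """W: (n_words, dim) embedding matrix. Returns I(W) in (0, 1]."""
    _, eigvecs = np.linalg.eigh(W.T @ W)   # eigenvectors of W^T W, as columns
    Z = np.exp(W @ eigvecs).sum(axis=0)    # partition function per eigenvector
    return Z.min() / Z.max()
```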
“…The same metric is used for measuring isotropy of contextual word representations by Rajaee and Pilehvar (2021). We randomly sample 10k sentences from English Wikipedia as V. We compute the average word-in-context embeddings for all words in each sentence and then compute the IS value. We repeat the process five times to reduce the randomness introduced in sampling.…”
confidence: 99%
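The procedure in this statement maps onto a short evaluation loop. The sketch below makes two assumptions: `embed_tokens` is a hypothetical helper that returns (word, vector) pairs of contextual embeddings for one sentence, and "average word-in-context embeddings for all words" is read as a per-word average across occurrences. It reuses `isotropy_score` from the sketch above, with `corpus` standing in for the sampled English Wikipedia sentences:

```python
# A sketch of the described evaluation loop: sample 10k sentences, average
# each word's in-context embeddings, compute the IS value, repeat 5 times.
from collections import defaultdict
import numpy as np

def average_is(corpus, embed_tokens, n_sentences=10_000, n_repeats=5, seed=0):
    """corpus: list of sentences; embed_tokens: hypothetical helper,
    embed_tokens(sentence) -> list of (word, vector) pairs."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_repeats):  # repeat to reduce sampling randomness
        sample = rng.choice(len(corpus), size=n_sentences, replace=False)
        by_word = defaultdict(list)
        for i in sample:
            for word, vec in embed_tokens(corpus[i]):
                by_word[word].append(vec)
        # Average each word's in-context embeddings, then score the space.
        W = np.stack([np.mean(vs, axis=0) for vs in by_word.values()])
        scores.append(isotropy_score(W))  # metric from the earlier sketch
    return float(np.mean(scores))
```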