Findings of the Association for Computational Linguistics: ACL 2022 (2022)
DOI: 10.18653/v1/2022.findings-acl.262

IsoScore: Measuring the Uniformity of Embedding Space Utilization

Abstract: The recent success of distributed word representations has led to an increased interest in analyzing the properties of their spatial distribution. Several studies have suggested that contextualized word embedding models do not isotropically project tokens into vector space. However, current methods designed to measure isotropy, such as average random cosine similarity and the partition score, have not been thoroughly analyzed and are not appropriate for measuring isotropy. We propose IsoScore: a novel tool tha…

Cited by 5 publications (6 citation statements)
References 7 publications
“…Therefore, methods that are based on the Principal Component Analysis (PCA) are the most appropriate to find and study the most elongated directions of the space. We present the two most robust PCA-based methods to quantify the isotropy, mainly the explained variance ratio and the IsoScore as highlighted in [21].…”
Section: Methods (mentioning)
confidence: 99%
“…The IsoScore [21] of an embedding space can be interpreted as the fraction of dimensions uniformly used by the embedding space. This score is derived from an isotropy defect that is calculated by computing the distance between the identity matrix and the normalized covariance matrix of the PCA-reoriented data.…”
Section: Isoscore (mentioning)
confidence: 99%
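The scoring step described in the statement above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the PCA step has already been done and takes the variances along the principal components as input. The function name `isoscore_from_variances` is hypothetical, and the normalization constants are recalled from the IsoScore paper rather than given in the quoted text, so treat them as assumptions.

```python
import math


def isoscore_from_variances(variances):
    """Return a score in [0, 1] from per-principal-component variances.

    Assumption: `variances` is the diagonal of the covariance matrix of
    the PCA-reoriented data (hypothetical input; PCA is not done here).
    """
    n = len(variances)
    if n < 2:
        raise ValueError("need at least two dimensions")
    norm = math.sqrt(sum(v * v for v in variances))
    if norm == 0:
        raise ValueError("all variances are zero")

    # Rescale the diagonal so it has the same Euclidean norm as the
    # diagonal of the identity matrix (sqrt(n)).
    scaled = [math.sqrt(n) * v / norm for v in variances]

    # Isotropy defect: distance to the identity diagonal, divided by the
    # largest value that distance can take, so the defect lies in [0, 1].
    distance = math.sqrt(sum((s - 1.0) ** 2 for s in scaled))
    defect = distance / math.sqrt(2 * (n - math.sqrt(n)))

    # Fraction of dimensions that are used uniformly.
    fraction = (n - defect ** 2 * (n - math.sqrt(n))) ** 2 / n ** 2

    # Linear rescaling so the score spans [0, 1] instead of [1/n, 1].
    return (n * fraction - 1) / (n - 1)
```

Under these assumptions, equal variances give a score of 1 and variance concentrated in a single direction gives a score of 0. In practice the variances would come from the eigenvalues of the sample covariance matrix, for example via `numpy.linalg.eigvalsh`.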
“…For similarity metric, we experiment with cosine similarity, as well as a more recent approach, IsoScore (Rudman et al., 2022). In our experiments, we find that cosine similarity performs better overall, so the results reported in the paper are using cosine similarity.…”
Section: Stratified Example Retrieval For Supervision (mentioning)
confidence: 96%
“…Isotropy is often estimated by different ways: singular value decomposition (SVD) (Biś et al., 2021; Gao et al., 2019; Liang et al., 2021; Wang et al., 2020a), intrinsic dimension (Cai et al., 2021), partition function (Arora et al., 2016; Mu and Viswanath, 2018), average cosine similarity (Ethayarajh, 2019). We chose the two firsts, along with IsoScore (Rudman et al., 2022) that complement themselves. We did not measure isotropy on models using embedding pooling, as it would be untractable considering the very large number of possible embeddings, and that the low frequency of the majority of them would result in unreliable results.…”
Section: Learned Embedding Spaces (mentioning)
confidence: 99%
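Of the estimators listed in the statement above, average cosine similarity is the simplest to illustrate. The sketch below is an assumption-laden toy, not code from any cited paper: the function name `average_cosine_similarity` is hypothetical, and it averages over all distinct pairs, whereas the "average random cosine similarity" named in the abstract is usually estimated from randomly sampled pairs.

```python
import math
from itertools import combinations


def average_cosine_similarity(vectors):
    """Mean cosine similarity over all distinct pairs of non-zero vectors."""
    similarities = []
    for u, v in combinations(vectors, 2):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        similarities.append(dot / (norm_u * norm_v))
    if not similarities:
        raise ValueError("need at least two vectors")
    return sum(similarities) / len(similarities)
```

A value near 1 means the vectors point in nearly the same direction. A value near 0 is often read as a sign of isotropy, although the abstract argues that this measure is not appropriate for that purpose.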