“…Hence, it might not be optimal for contextual embeddings, especially given that the latter tend to have a clustered structure. For instance, recent work suggests that word types (e.g., verbs, nouns, punctuation), entities (e.g., personhood, nationalities, and dates), and even word senses (Michael et al., 2020; Loureiro et al., 2021; Reif et al., 2019) form distinct local clusters in the contextual embedding space. Moreover, our local assessment shows that clusters do not necessarily share the same dominant directions.…”
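
The claim that clusters need not share dominant directions can be illustrated with a minimal sketch on synthetic data. It is not the authors' procedure. It assumes that a "dominant direction" means the top principal component of the mean-centered vectors, and the function names, cluster shapes, and dimensions below are hypothetical choices made only for illustration.

```python
import numpy as np


def top_directions(X, k=1):
    """Return the top-k principal directions (as rows) of the rows of X."""
    # Center the data so the directions describe variance, not the mean offset.
    Xc = X - X.mean(axis=0, keepdims=True)
    # The right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k]


def make_toy_clusters(n=200, dim=8, seed=0):
    """Build two synthetic clusters (hypothetical stand-ins for embedding clusters)."""
    rng = np.random.default_rng(seed)
    scale_a = np.ones(dim)
    scale_a[0] = 10.0  # cluster A is stretched along axis 0
    scale_b = np.ones(dim)
    scale_b[1] = 10.0  # cluster B is stretched along axis 1
    a = rng.normal(size=(n, dim)) * scale_a
    # Shift cluster B along axis 2 so the two clusters are well separated.
    b = rng.normal(size=(n, dim)) * scale_b + 50.0 * np.eye(dim)[2]
    return a, b


a, b = make_toy_clusters()
dir_a = top_directions(a)[0]
dir_b = top_directions(b)[0]
dir_global = top_directions(np.vstack([a, b]))[0]

# Absolute cosine similarity, since the sign of a principal direction is arbitrary.
cos_ab = abs(float(dir_a @ dir_b))
cos_global_axis2 = abs(float(dir_global @ np.eye(8)[2]))

print("cosine between cluster-wise dominant directions:", cos_ab)
print("cosine between global dominant direction and the cluster offset:", cos_global_axis2)
```

In this toy setting the two cluster-wise directions are nearly orthogonal, while the global direction mostly follows the offset between the cluster means. A direction estimated over the whole space therefore need not match the dominant direction of any individual cluster, which is the situation the passage describes.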