Factors Influencing the Surprising Instability of Word Embeddings

Wendlandt, Laura; Kummerfeld, Jonathan K.; Mihalcea, Rada

doi:10.18653/v1/n18-1190

Cited by 79 publications

(105 citation statements)

References 23 publications

Supporting

Mentioning

101

Contrasting

Order By: Relevance

“…The large drops in performance observed when using the CDE transformation is likely to relate 6 Due to their large vocabulary size, we were unable to run Thresholded-NNE experiments with word2vec embeddings. to the instability of nearest neighborhoods and the importance of locality in embedding learning (Wendlandt et al, 2018), although the effects of the autoencoder component also bear further investigation. By effectively increasing the size of the neighborhood considered, CDE adds additional sources of semantic noise.…”

Section: Analysis and Discussionmentioning

confidence: 99%

“…This transformation relates to the common use of nearest neighborhoods as a proxy for semantic information (Wendlandt et al, 2018;Pierrejean and Tanguy, 2018). We take the previously proposed approach of combining the output of f NNE (v) for each v ∈ V to form a sparse adjacency matrix, which describes a directed nearest neighbor graph (Cuba Gyllensten and Sahlgren, 2015; Newman-Griffis and Fosler-Lussier, 2017), using three versions of f NNE defined below.…”

Section: Nearest Neighbor Encoding (Nne)mentioning

confidence: 99%

“…2018; Pierrejean and Tanguy, 2018). However, neighborhood-based analysis is limited by the unreliability of nearest neighborhoods (Wendlandt et al, 2018). Further, it is intended to evaluate the semantic content of embedding spaces, as opposed to characteristics of the feature space itself.…”

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Characterizing the Impact of Geometric Properties of Word Embeddings on Task Performance

Whitaker¹,

Newman-Griffis

Haldar

et al. 2019

Proceedings of the 3rd Workshop on Evaluating Vector Space Representations For

View full text Add to dashboard Cite

Analysis of word embedding properties to inform their use in downstream NLP tasks has largely been studied by assessing nearest neighbors. However, geometric properties of the continuous feature space contribute directly to the use of embedding features in downstream models, and are largely unexplored. We consider four properties of word embedding geometry, namely: position relative to the origin, distribution of features in the vector space, global pairwise distances, and local pairwise distances. We define a sequence of transformations to generate new embeddings that expose subsets of these properties to downstream models and evaluate change in task performance to understand the contribution of each property to NLP models. We transform publicly available pretrained embeddings from three popular toolkits (word2vec, GloVe, and FastText) and evaluate on a variety of intrinsic tasks, which model linguistic information in the vector space, and extrinsic tasks, which use vectors as input to machine learning models. We find that intrinsic evaluations are highly sensitive to absolute position, while extrinsic tasks rely primarily on local similarity. Our findings suggest that future embedding models and post-processing techniques should focus primarily on similarity to nearby points in vector space.

show abstract

Section: Analysis and Discussionmentioning

confidence: 99%

Section: Nearest Neighbor Encoding (Nne)mentioning

confidence: 99%

See 1 more Smart Citation

Characterizing the Impact of Geometric Properties of Word Embeddings on Task Performance

Whitaker¹,

Newman-Griffis

Haldar

et al. 2019

Proceedings of the 3rd Workshop on Evaluating Vector Space Representations For

View full text Add to dashboard Cite

show abstract

“…Prior research has noted instability of nearest neighborhoods in multiple embedding methods (Wendlandt et al, 2018). We therefore train 10 sets of embeddings from each of our subcorpora, each using the same hyperparameter settings but a different random seed.…”

Section: Identifying Concepts For Comparisonmentioning

confidence: 99%

Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

Newman-Griffis

Fosler‐Lussier

2019

Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

View full text Add to dashboard Cite

Natural language processing techniques are being applied to increasingly diverse types of electronic health records, and can benefit from in-depth understanding of the distinguishing characteristics of medical document types. We present a method for characterizing the usage patterns of clinical concepts among different document types, in order to capture semantic differences beyond the lexical level. By training concept embeddings on clinical documents of different types and measuring the differences in their nearest neighborhood structures, we are able to measure divergences in concept usage while correcting for noise in embedding learning. Experiments on the MIMIC-III corpus demonstrate that our approach captures clinically-relevant differences in concept usage and provides an intuitive way to explore semantic characteristics of clinical document collections.

show abstract

“…Stability can be quantified by calculating the overlap between sets of words considered most similar in relation to pre-selected anchor words. Reasonable metrical choices are, e.g., the Jaccard coefficient (Jaccard, 1912) between these sets (Antoniak and Mimno, 2018;Chugh et al, 2018), or a percentage based coefficient (Hellrich and Hahn, 2016a,b;Wendlandt et al, 2018;Pierrejean and Tanguy, 2018). We here use j@n, i.e., the Jaccard coefficient for the n most similar words.…”

Section: Measuring Stabilitymentioning

confidence: 99%

The Influence of Down-Sampling Strategies on

Hellrich

Kampe

Hahn

2019

Proceedings of the 3rd Workshop on Evaluating Vector Space Representations For

View full text Add to dashboard Cite

The stability of word embedding algorithms, i.e., the consistency of the word representations they reveal when trained repeatedly on the same data set, has recently raised concerns. We here compare word embedding algorithms on three corpora of different sizes, and evaluate both their stability and accuracy. We find strong evidence that down-sampling strategies (used as part of their training procedures) are particularly influential for the stability of SVD PPMI -type embeddings. This finding seems to explain diverging reports on their stability and lead us to a simple modification which provides superior stability as well as accuracy on par with skip-gram embeddings.

show abstract

Factors Influencing the Surprising Instability of Word Embeddings

Cited by 79 publications

References 23 publications

Characterizing the Impact of Geometric Properties of Word Embeddings on Task Performance

Characterizing the Impact of Geometric Properties of Word Embeddings on Task Performance

Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

The Influence of Down-Sampling Strategies on

Contact Info

Product

Resources

About