Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1141

Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change

Abstract: Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Usi…

Cited by 626 publications (876 citation statements)
References 43 publications
“…In contrast to other corpus exploration tools, JESEME is based on cutting-edge word embedding technology (Levy et al, 2015;Hamilton et al, 2016;Hahn, 2016a, 2017) and provides access to five popular corpora for the English and German language. JESEME is also the first tool of its kind and under continuous development.…”
Section: Results (mentioning)
confidence: 99%
“…Future technical work will add functionality to compare words across corpora which might require a mapping between embeddings (Kulkarni et al, 2015;Hamilton et al, 2016) and provide optional stemming routines. Both goals come with an increase in precomputed similarity values and will thus necessitate storage optimizations to ensure long-term availability.…”
Section: Results (mentioning)
confidence: 99%
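
The "mapping between embeddings" mentioned in the statement above can be illustrated with orthogonal Procrustes alignment, one common way to rotate one embedding space onto another. This is a minimal sketch, assuming NumPy and toy random matrices; the function name `procrustes_align` and all data are hypothetical and are not taken from JESEME or from the cited papers' code.

```python
import numpy as np

def procrustes_align(source, target):
    """Rotate `source` onto `target` with the orthogonal matrix R
    that minimizes ||source @ R - target||_F (rows = shared vocabulary)."""
    # The SVD of the cross-covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation, rotation

# Toy data (assumption): 50 "words" in 8 dimensions.
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 8))

# Build a random orthogonal matrix and use it to "scramble" the target space.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
source = target @ q.T  # by construction, source @ q equals target

aligned, rotation = procrustes_align(source, target)
print(np.allclose(aligned, target))
```

Because the rotation is orthogonal, cosine similarities within each embedding space are unchanged by the alignment; only the coordinate systems are brought into agreement so that vectors from different corpora or periods can be compared.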
“…Applied to semantic change, if you have at your disposal a bunch of diachronic corpora, you can build the semantic vectors of any lexical unit corresponding to several periods, and track the changes from one period to another. First experiments have been proposed by (Hamilton et al, 2016). The main advantage of this approach resides in the fact that it proposes for a given word a list of semantically similar words, among which synonyms and hypernyms, which permits to clearly explicit the meaning of a word.…”
Section: Semantic Neology Approaches (mentioning)
confidence: 99%
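
The idea described in the statement above, building a vector for a lexical unit in each period and tracking how it changes, can be sketched as follows. This is an illustrative sketch only: the vocabulary, the two-dimensional vectors, and the helper names `cosine_distance` and `nearest_neighbors` are invented for the example, and the two period matrices are assumed to be already aligned to a common space.

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest_neighbors(word, vocab, matrix, k=2):
    """Return the k words most similar to `word` within one period."""
    idx = vocab.index(word)
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    sims = unit @ unit[idx]
    # Sort by descending similarity and skip the query word itself.
    order = [i for i in np.argsort(-sims) if i != idx]
    return [vocab[i] for i in order[:k]]

# Hypothetical toy data: one row per word, one matrix per period.
vocab = ["target", "neighbor_a", "neighbor_b"]
period_early = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]])
period_late = np.array([[0.1, 1.0], [0.9, 0.2], [0.2, 0.9]])

for name, matrix in [("early", period_early), ("late", period_late)]:
    print(name, nearest_neighbors("target", vocab, matrix, k=1))

# A large distance between a word's two period vectors suggests a shift.
print(cosine_distance(period_early[0], period_late[0]))
```

In this toy setup the word "target" moves from the neighborhood of "neighbor_a" to that of "neighbor_b", which is the kind of neighbor-list change the quoted passage describes as making a word's meaning explicit.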
“…This is problematic when these models are used as input for information retrieval tasks, such as automatic event extraction, automatic summarization, etc. Based on this observation we propose a preliminary work close to those initiated in [1,7] that seeks to compare two word embedding-based models: one ignoring the temporal aspect of the messages, using the state-of-the-art Word2Vec model [9], and one taking advantage of the emission date of tweets, using the Temporal Embedding approach.…”
Section: Introduction (mentioning)
confidence: 99%