Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1072
A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains

Abstract: We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains. Our work addresses the superficialness and lack of comparison in assessing models of diachronic lexical change, by bringing together and extending benchmark models on a common state-of-the-art evaluation task. In addition, we demonstrate that the same evaluation task and modelling approaches can…

Cited by 67 publications (74 citation statements)
References 46 publications
“…Applying dimension-wise mean centering has the effect of spreading the vectors across the hyperplane and mitigating the hubness issue, which consequently improves word-level similarity, as emerges from the reported results. Previous work has already validated the importance of mean centering for clustering-based tasks (Suzuki et al. 2013), bilingual lexicon induction with cross-lingual word embeddings (Artetxe, Labaka, and Agirre 2018a), and for modeling lexical semantic change (Schlechtweg et al. 2019). However, to the best of our knowledge, the results summarized in Table 12 are the first evidence that also confirms its importance for semantic similarity in a wide array of languages.…”
Section: Results
confidence: 51%
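The mean centering discussed in the statement above can be sketched in a few lines: subtracting the dimension-wise mean from an embedding matrix recenters the vectors around the origin, which spreads them across the space and reduces hubness. This is a minimal NumPy illustration, not the cited authors' implementation; the function name is ours.

```python
import numpy as np

def mean_center(embeddings: np.ndarray) -> np.ndarray:
    """Subtract the per-dimension mean from every row vector.

    After centering, each embedding dimension has zero mean across
    the vocabulary, which mitigates hubness in similarity search.
    """
    return embeddings - embeddings.mean(axis=0, keepdims=True)

# Toy embedding matrix: 3 words, 2 dimensions.
vecs = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])
centered = mean_center(vecs)
# Every column of `centered` now sums to zero.
```

Cosine similarities are then computed on the centered vectors rather than the raw ones.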
“…Using different base embeddings, SGNS (Bamler and Mandt, 2017), PPMI (Yao et al., 2018), and Bernoulli embeddings (Rudolph and Blei, 2018), the results show that sharing data is beneficial regardless of the method. Temporal Referencing was first applied in the field of term extraction (Ferrari et al., 2017) and has recently been tested for diachronic LSC detection (Schlechtweg et al., 2019).…”
Section: Related Work
confidence: 99%
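The core idea of Temporal Referencing mentioned above is to train a single embedding space over all time periods while giving target words a period-specific token, so that only the targets receive one vector per period and all other words share vectors. A minimal sketch of the token-rewriting step, assuming a simple suffix scheme of our own choosing (the cited papers' exact conventions may differ):

```python
def temporal_reference(tokens, targets, period):
    """Rewrite target-word tokens with a period suffix.

    A corpus preprocessed this way and fed to any standard embedding
    trainer yields one vector per (target, period) pair, while context
    words keep a single shared vector across periods.
    """
    return [f"{tok}_{period}" if tok in targets else tok
            for tok in tokens]

# Toy example: track the target "plane" in an 1850s corpus slice.
sentence = ["the", "plane", "landed"]
rewritten = temporal_reference(sentence, {"plane"}, "1850")
# → ["the", "plane_1850", "landed"]
```

Because everything is trained in one space, no post-hoc alignment between period-specific spaces is needed.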
“…The successful outcome of semantic change detection is relevant to any diachronic textual analysis, including machine translation or normalization of historical texts (Tjong Kim Sang et al., 2017), the detection of cultural semantic shifts (Kutuzov et al., 2017), or applications in digital humanities (Tahmasebi and Risse, 2017a). However, currently the best-performing models (Hamilton et al., 2016b; Kulkarni et al., 2015; Schlechtweg et al., 2019) require a complex alignment procedure and have been shown to suffer from biases (Dubossarsky et al., 2017). This exposes them to various sources of noise influencing their predictions; a fact which has long gone unnoticed because of the lack of standard evaluation procedures in the field.…”
Section: Introduction
confidence: 99%
“…All corpora are lemmatized and POS-tagged with the TreeTagger (Schmid, 1995), and reduced to content words (nouns, verbs and adjectives). We follow the preprocessing steps described in Schlechtweg et al. (2019) that led to the best results in that study. The corpus sizes are shown in Table 1.…”
Section: Data and Gold Standard Creation
confidence: 99%
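The content-word reduction described in the statement above can be sketched as a filter over (lemma, POS) pairs. This is an illustrative stand-in, not the cited pipeline: the tag prefixes below assume a Penn-style tagset, whereas TreeTagger's actual tagset depends on the language model used.

```python
# Assumed Penn-style tag prefixes for nouns, verbs, and adjectives.
CONTENT_POS_PREFIXES = ("NN", "VB", "JJ")

def content_words(tagged):
    """Keep only lemmas whose POS tag marks a content word.

    `tagged` is a list of (lemma, pos_tag) pairs, as produced by a
    POS tagger; function words (determiners, prepositions, ...) are
    dropped before embedding training.
    """
    return [lemma for lemma, pos in tagged
            if pos.startswith(CONTENT_POS_PREFIXES)]

# Toy tagged sentence.
tagged = [("the", "DT"), ("wind", "NN"), ("change", "VB")]
reduced = content_words(tagged)
# → ["wind", "change"]
```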