2022
DOI: 10.1007/s10579-021-09575-z
A comparative evaluation and analysis of three generations of Distributional Semantic Models

Abstract: Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly …

Cited by 37 publications (39 citation statements)
References 64 publications
“…We extracted all semantic vectors from a state-of-the-art model for representing lexical semantics (Lenci, Sahlgren, Jeuniaux, Gyllensten, & Miliani, 2021), which was identified as the best-performing distributional semantic model in the systematic evaluation study by Baroni, Dinu, and Kruszewski (2014): a CBOW variant of the word2vec algorithm (Mikolov, Chen, Corrado, & Dean, 2013), aimed at predicting each target word in the corpus from the 5 words to its left and its right (i.e., window size 5), with negative sampling (k = 10) and subsampling (t = 1e-5). The model was trained on an English ~2.8 billion-word corpus (a concatenation of the ukWaC corpus, an English Wikipedia dump, and the British National Corpus).…”
Section: Semantic Vectors
confidence: 99%
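The statement above fully specifies the word2vec configuration, so a minimal sketch may help make it concrete. The snippet below uses gensim's Word2Vec class, which is an assumption (the excerpt does not name the training toolkit), and toy_corpus is a hypothetical stand-in for the ~2.8 billion-word corpus; only the hyperparameters (CBOW, window 5, k = 10, t = 1e-5) come from the quoted text.

    from gensim.models import Word2Vec

    # Hypothetical toy stand-in for the ~2.8 billion-word corpus described above.
    toy_corpus = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog", "chased", "the", "cat"],
    ]

    model = Word2Vec(
        sentences=toy_corpus,
        sg=0,           # sg=0 selects the CBOW architecture
        window=5,       # predict each target from 5 words on either side
        negative=10,    # negative sampling with k = 10
        sample=1e-5,    # subsampling threshold t = 1e-5 (tuned for huge
                        # corpora; on a toy corpus it discards most tokens)
        min_count=1,    # keep all words in this tiny example
    )
    vector = model.wv["cat"]  # one semantic vector per vocabulary word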
“…The bulk of the English lexicon is low in frequency (e.g., Webb & Rodgers, 2009), yet people struggle to process those low-frequency words (e.g., Monsell et al., 1989), as do state-of-the-art computational models (e.g., Lenci et al., 2022). We have provided evidence that English speakers compensate for a lack of experience with words by recruiting similar-sounding words with related meanings.…”
Section: Discussion
confidence: 85%
“…Unlike previous lexical decision tasks that investigated words that overlap with other words in both form and meaning (e.g., Bergen, 2004; Pastizzo & Feldman, 2009), this semantic relatedness task requires word meaning access. Indeed, semantic relatedness tasks are often used to evaluate the performance of DSMs (e.g., Baroni, Dinu, & Kruszewski, 2014; Mandera et al., 2017; Lenci et al., 2022). Moreover, unlike those masked priming studies, participants in this experiment are never presented with the similar-sounding attractor words (e.g., avoid).…”
Section: Methods
confidence: 99%
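As a concrete illustration of the evaluation procedure this statement refers to, here is a minimal sketch of scoring a DSM on a semantic relatedness benchmark: the model's cosine similarities for word pairs are rank-correlated with human relatedness ratings. The word pairs, ratings, and random vectors below are invented for illustration and are not taken from any of the cited datasets or models.

    import numpy as np
    from scipy.stats import spearmanr

    def cosine(u, v):
        # cosine similarity, the standard word-pair score in DSM evaluation
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Invented benchmark rows: (word1, word2, human relatedness rating).
    benchmark = [("cat", "dog", 7.8), ("cat", "mat", 2.1), ("dog", "bone", 6.0)]

    # Random vectors as a stand-in for a trained DSM.
    rng = np.random.default_rng(0)
    vectors = {w: rng.standard_normal(300) for w in {"cat", "dog", "mat", "bone"}}

    model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in benchmark]
    human_scores = [r for _, _, r in benchmark]
    rho, _ = spearmanr(model_scores, human_scores)  # rank correlation with humans
    print(f"Spearman rho: {rho:.2f}")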