2020
DOI: 10.3389/fams.2020.593406

The Spectral Underpinning of word2vec

Abstract: Word2vec, introduced by Mikolov et al., is a word embedding method that is widely used in natural language processing. Despite its success and frequent use, a strong theoretical justification is still lacking. The main contribution of our paper is to propose a rigorous analysis of the highly nonlinear functional of word2vec. Our results suggest that word2vec may be primarily driven by an underlying spectral method. This insight may open the door to obtaining provable guarantees for word2vec. We support these fin…

Cited by 5 publications (3 citation statements)
References 9 publications
“…Furthermore, during the COVID-19 pandemic, Twitter has become a place for people to express their opinions, thoughts, and feelings via tweets [5]. A phrase conveyed through a tweet can reflect how a person's feelings or emotions are experienced.…”
Citing article: A. N. Sutranggono and E. M. Imah, "Tweets Emotions Analysis of Community Activities Restriction as COVID-19 Policy in Indonesia Using Support Vector Machine", CommIT Journal 17 (1), 13-25, 2023.
mentioning
confidence: 99%
“…The word embedding method is one of the feature engineering methods commonly used in Natural Language Processing (NLP). Each word in the text can be represented numerically based on text corpora using the word2vec embedding method [16]. Nevertheless, this method has a limitation.…”
mentioning
confidence: 99%
“…For this proof, we derive a relation between latent tree models and a classic result from spectral graph theory known as Fiedler's theorem of nodal domains [18]. This theorem is important in various learning tasks such as clustering data [50], graph partitioning [11], and low dimensional embeddings [25]. To the best of our knowledge, this is the first guarantee derived for spectral partitioning in the setting of latent tree models.…”
Section: Contributions and Outline
mentioning
confidence: 99%