2020
DOI: 10.1162/tacl_a_00326

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

Abstract: We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated by a hidden semantic vector encoding its contextual semantic meaning, and its context words are generated conditional on both the hidden semantic vector and global latent topics. Topics are trained jointly with the word embeddings. The trained model maps words to topic-dependent embeddings…
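The generative story in the abstract can be made concrete with a small numerical sketch. The code below is a minimal illustration only: the Dirichlet document-topic mixture, Gaussian hidden semantic vectors, softmax emissions over inner products with word embeddings, and all variable names are assumptions chosen for exposition, not the paper's exact parameterization.

```python
# Minimal sketch of the generative story from the abstract (assumption-heavy):
# a document draws a topic mixture, each pivot word is generated from a hidden
# semantic vector, and its context words are generated from both that vector
# and the document's global topics.
import numpy as np

rng = np.random.default_rng(0)

V, K, D = 1000, 20, 50                  # vocabulary size, number of topics, embedding dim
topic_emb = rng.normal(size=(K, D))     # global topic embeddings, shared across documents
word_emb = rng.normal(size=(V, D))      # word embeddings used for emission

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def generate_document(n_words=30, window=2):
    theta = rng.dirichlet(np.ones(K))            # document-level topic mixture
    words, contexts = [], []
    for _ in range(n_words):
        z = rng.normal(size=D)                   # hidden semantic vector for the pivot word
        w = rng.choice(V, p=softmax(word_emb @ z))        # pivot word generated from z
        topic_vec = theta @ topic_emb            # expected topic embedding for this document
        ctx_logits = word_emb @ (z + topic_vec)  # context depends on both z and global topics
        c = rng.choice(V, size=window, p=softmax(ctx_logits))
        words.append(w)
        contexts.append(c)
    return words, contexts
```

Because the context distribution mixes the word's own semantic vector with the document's topics, the same word type can receive different topic-dependent representations in different documents, which is the property the abstract highlights.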

Cited by 10 publications (2 citation statements)
References 22 publications
“…Thompson and Mimno (2020) showed that clustering the contextual representations of a given set of words can produce clusters of semantically related words, which were found to be similar in spirit to LDA topics. The idea of learning topic-specific representations of words has been extensively studied in the context of standard word embeddings (Liu et al., 2015; Li et al., 2016; Shi et al., 2017; Zhu et al., 2020). To the best of our knowledge, learning topic-specific word representations using CLMs has not yet been studied.…”
Section: Related Work (mentioning)
confidence: 99%
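The clustering idea mentioned in this citation statement (Thompson and Mimno, 2020) can be sketched as follows. The sketch assumes that contextual vectors for word occurrences have already been computed by some contextual language model; the `contextual_vecs` array and `occurrence_words` labels are hypothetical stand-ins, not output of any specific pipeline.

```python
# Illustrative sketch: k-means over contextual representations of word occurrences,
# producing topic-like clusters of semantically related words.
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
n_occurrences, dim, n_clusters = 500, 768, 10

# Stand-in for contextual vectors (one vector per token occurrence) from a contextual LM.
contextual_vecs = rng.normal(size=(n_occurrences, dim))
occurrence_words = [f"word_{i % 50}" for i in range(n_occurrences)]   # hypothetical labels

labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(contextual_vecs)

clusters = defaultdict(set)
for word, label in zip(occurrence_words, labels):
    clusters[label].add(word)   # each cluster collects words whose occurrences co-locate
```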
“…Some very recent works use pre-trained BERT (Devlin et al., 2019) either to leverage improved text representations (Bianchi et al., 2020; Sia et al., 2020) or to augment topic models through knowledge distillation (Hoyle et al., 2020a). Zhu et al. (2020) and Dieng et al. (2020) jointly train words and topics in a shared embedding space. However, we train topic-word distribution as part of our model, embed it using word embeddings being learned and use resultant topic embeddings to perform attention over sequentially processed tokens.…”
Section: Introduction (mentioning)
confidence: 99%
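The mechanism described by these citing authors, embedding the topic-word distribution with the word embeddings being learned and letting the resulting topic embeddings attend over token states, can be sketched as below. The shapes, the softmax attention form, and all variable names are assumptions for illustration, not the citing paper's actual architecture.

```python
# Assumption-heavy sketch: topic embeddings derived from a topic-word distribution
# attend over sequentially processed token states.
import numpy as np

rng = np.random.default_rng(1)
V, K, D, T = 1000, 20, 64, 12                   # vocab size, topics, hidden dim, sequence length

word_emb = rng.normal(size=(V, D))              # word embeddings being learned
topic_word = rng.dirichlet(np.ones(V), size=K)  # topic-word distributions, shape (K, V)
token_states = rng.normal(size=(T, D))          # e.g. encoder outputs for one sequence

topic_emb = topic_word @ word_emb               # embed each topic as a weighted sum of word embeddings

scores = topic_emb @ token_states.T             # (K, T) topic-to-token attention scores
scores -= scores.max(axis=1, keepdims=True)
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)   # softmax over tokens
topic_context = attn @ token_states             # (K, D) topic-specific summaries of the sequence
```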