2022
DOI: 10.3389/fsoc.2022.886498

A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts

Abstract: The richness of social media data has opened a new avenue for social science research to gain insights into human behaviors and experiences. In particular, emerging data-driven approaches relying on topic models provide entirely new perspectives on interpreting social phenomena. However, the short, text-heavy, and unstructured nature of social media content often leads to methodological challenges in both data collection and analysis. In order to bridge the developing field of computational science and empiric…

Cited by 276 publications (161 citation statements)
References 59 publications
“…Specifically, we looked at the frequency of post publishing. Then, we analyzed the thematic composition of these posts using BERTopic (Grootendorst, 2022), a novel topic modeling technique that supports dynamic topic modeling for multilingual text corpora and demonstrates high performance across different domains (Egger & Yu, 2022). We chose topic modeling for the analysis of frames since this method offers an advantage of mining emerging frames from a body of texts rather than imposing pre-determined classification categories on the data set, and is thus especially useful in fast-changing and previously unexplored contexts (Groshek & Engelbert, 2013) such as the one we focus on.…”
Section: Data Analysis Rq1: Representation Of the War In Ukraine Thro... (mentioning)
confidence: 99%
“…NMF uses a linear algebra approach to topic extraction. [Egger and Yu 2022] evaluate and compare the performance of four topic modeling techniques: LDA, NMF, Top2Vec, and BERTopic. In addition to the methods already mentioned, Top2Vec uses embeddings to represent short texts.…”
Section: Related Work (unclassified)
“…In reviewing the works presented in this section, we found that some authors carried out an exploratory study of methods specific to short texts [Qiang et al. 2020, Costa and Duarte 2019], while others focused on evaluating the performance of topic modeling methods for generic texts applied to short texts [Albalawi et al. 2020]. Finally, the approaches presented assessed only the performance of generic-text methods with and without embeddings [Egger and Yu 2022]. None of the related works explored what this article proposes: a comparative analysis of traditional methods (LDA), models for short texts (GSDMM and PTM), and new approaches for generic texts that use embeddings (BERTopic).…”
Section: Related Work (unclassified)
“…With that, it can automatically detect topics present in the text and generate jointly embedded topic, document, and word vectors. There are studies that compared LDA and Top2Vec (Ma et al., 2021; Egger and Yu, 2022). They reported that Top2Vec produced qualitatively higher quality results than LDA.…”
Section: Introduction (mentioning)
confidence: 99%