PsycEXTRA Dataset 2012
DOI: 10.1037/e557102013-086

Matching Results of Latent Dirichlet Allocation for Text

Abstract: Many approaches have been introduced to enable Latent Dirichlet Allocation (LDA) models to be updated in an online manner. These include inferring new documents into the model, passing parameter priors to the inference algorithm, or a mixture of both, all of which lead to more complicated and computationally expensive models. We present a method to match and compare the resulting LDA topics of different models with lightweight, easy-to-use similarity measures. We address the online problem by keeping the model inference…
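The idea of matching topics across separately trained models can be sketched as follows. This is a minimal illustration only: the abstract does not say which similarity measures are used, so cosine similarity, the `threshold` value, and the function names `cosine_similarity` and `match_topics` are all assumptions made for this example. Each topic is assumed to be a list of word probabilities over the same vocabulary in the same order.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_topics(topics_a, topics_b, threshold=0.5):
    """Map each topic index in model A to its most similar topic index in model B.

    A topic with no counterpart scoring at least `threshold` maps to None.
    The threshold of 0.5 is an arbitrary illustrative choice.
    """
    matches = {}
    for i, topic_a in enumerate(topics_a):
        scores = [cosine_similarity(topic_a, topic_b) for topic_b in topics_b]
        if not scores:
            matches[i] = None
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        matches[i] = best if scores[best] >= threshold else None
    return matches


# Toy topic-word distributions over a three-word vocabulary (made-up numbers).
model_a = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
model_b = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
matches = match_topics(model_a, model_b)
```

In this toy case, topic 0 of the first model pairs with topic 1 of the second and vice versa. Matching is greedy per topic, so two topics in one model may map to the same topic in the other.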

Cited by 13 publications (13 citation statements)
References: 7 publications
“…This calls for reliability checks that indicate the robustness of the topic solutions. We provide an easy-to-calculate reliability metric (Niekler & Jähnichen, 2012) and show that random initialization is a weakness in the LDA architecture. It is clearly inferior to non-random initialization methods, which, as we demonstrate, can improve the reliability of an LDA topic model.…”
Section: Advantages, Limitations and Challenges of Applying LDA (mentioning)
confidence: 99%
“…To compare the similarity of frames used by climate skeptics in their online communication and in mass media reporting, we relied on the Jensen-Shannon divergence (JSD). This is a smoothed and symmetric derivative of the Kullback-Leibler (KL) divergence, which is a common measure when comparing distributions [78]. The normalized outcomes of the JSD can be used as measure of similarity between two probability distributions and is therefore well suited for the comparison of the topic distributions of our online and offline samples.…”
Section: PLOS ONE (mentioning)
confidence: 99%
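The relationship between the JSD and the KL divergence described in this statement can be sketched in a few lines. This is an illustrative sketch, not the citing authors' code: the function names are hypothetical, and base-2 logarithms are an assumption chosen so that the divergence falls between 0 and 1. Turning it into a similarity as 1 − JSD is likewise one possible normalization, not necessarily the one used in the cited study.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint m.

    Because m_i > 0 wherever p_i > 0 or q_i > 0, the result is always finite,
    and swapping p and q leaves it unchanged.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_similarity(p, q):
    """Similarity in [0, 1]: 1 for identical distributions, 0 for disjoint ones."""
    return 1.0 - js_divergence(p, q)


# Two made-up topic distributions over three topics.
online = [0.5, 0.3, 0.2]
offline = [0.2, 0.3, 0.5]
similarity = js_similarity(online, offline)
```

Library implementations may differ in convention. For example, some return the square root of the divergence (the Jensen-Shannon distance) or use natural logarithms, so values are only comparable under the same convention.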
“…B. Mantyla et al., 2018; Su et al., 2016; Greene et al., 2014; Nguyen et al., 2014; Stevens et al., 2012; Newman et al., 2011; Mimno et al., 2011) as well as from the field of communication science (e.g., Maier et al., 2018; Niekler, 2016; Niekler and Jähnichen, 2012) Both examples illustrate that interdisciplinary innovations, i.e., research results that represent an advance in knowledge for all participating disciplines, became possible in the case of DoCMA only after a preceding phase of rapprochement. The phase of utilizing and adapting thus formed a necessary foundation here, one that fostered mutual understanding (not to say mutual empathy) and so enabled the identification of integrative-synthetic research questions.…”
Section: Practical Example 1: Efficient Sampling (unclassified)