Latent Dirichlet Allocation with topic-in-set knowledge

Andrzejewski, David; Zhu, Xiaojin

doi:10.3115/1621829.1621835

Cited by 125 publications

(57 citation statements)

References 12 publications


“…In this way, we can "word seed" a topic to a given set of words. A similar method has been used by Jagarlamudi et al (2012) and Andrzejewski & Zhu (2009), but unlike those previous approaches, we do not change the underlying model to incorporate this prior knowledge. Instead, we put a restriction on which topics the seeding words are allowed to belong to by using conditional Dirichlet distributions, conditioned on the probability being zero in all topics other than the seeded ones.…”

Section: Methods (mentioning)

confidence: 99%
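The restriction described in the statement above can be illustrated with a minimal sketch. It assumes that a Dirichlet conditioned on some components being zero can be drawn as a Dirichlet over the remaining components only; the function name `draw_seeded_topics`, the `seeds` mapping, and the toy vocabulary are hypothetical and are not taken from the cited papers.

```python
import random


def draw_seeded_topics(vocab, seeds, n_topics, beta=0.5, rng=None):
    """Draw one word distribution per topic, confining seed words to their topics.

    `seeds` maps a word to the set of topic indices it may belong to
    (a hypothetical structure chosen for this sketch).
    """
    rng = rng or random.Random()
    phi = []
    for k in range(n_topics):
        # A word is allowed in topic k if it is unseeded or seeded to k.
        # Normalised Gamma draws give a Dirichlet sample over the allowed
        # words; disallowed words keep a probability of exactly zero.
        weights = [
            rng.gammavariate(beta, 1.0) if (w not in seeds or k in seeds[w]) else 0.0
            for w in vocab
        ]
        total = sum(weights)
        phi.append([x / total for x in weights])
    return phi


# Toy example: two seed words are tied to topic 0.
vocab = ["immigration", "asylum", "tax", "budget", "school"]
seeds = {"immigration": {0}, "asylum": {0}}
phi = draw_seeded_topics(vocab, seeds, n_topics=3, rng=random.Random(0))
```

In this sketch the seed words can only receive mass in topic 0, while unseeded words remain free to appear in any topic.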

Voices from the far right: a text analysis of Swedish parliamentary debates

Magnusson, Öhrvall, Barrling et al. 2018

Preprint


In this paper we study the effects of a radical right party entering a national parliament, on the parliament discourse. We follow the classification developed by Meguid (2008) and use a probabilistic topic model approach to analyze the 300,000 speeches delivered in the Swedish parliament between 1994 and 2017. Our results indicate that immigration became a more prevalent topic in party leader debates when the Sweden Democrats entered the parliament in 2010. The other parties started to address immigration more, but still not to the extent that the Sweden Democrats did. In 2015, as Sweden faced a migration crisis, immigration became a more salient issue in the parliamentary debates. This could be seen as an external shock that forced the mainstream parties to put more emphasis on the topic of immigration. We conclude that the mainstream parties used a partly dismissive, partly adversarial strategy in their speeches when the SD entered the parliament. The migration crisis in 2015 made them focus more on immigration and they thereby adopted a more adversarial strategy.


“…works, many variations have been proposed [1,2,4,6,9,10,26,27,29,30,32,37,40]. In this paper, we only focus on the variations that add supervised information in the form of latent topic assignments.…”

Section: Introduction (mentioning)

confidence: 99%

“…To the best of our knowledge, this is the first constrained LDA model which can process large scale constraints in the forms of must-links and cannot-links. There are two existing work by Andrzejewski and Zhu [1,2] that are related to the proposed model. However, [1] only considers must-link constraints.…”

Section: Introduction (mentioning)

confidence: 99%

“…There are two existing work by Andrzejewski and Zhu [1,2] that are related to the proposed model. However, [1] only considers must-link constraints. In [2], the number of maximal cliques grow exponentially in the process of encoding constraints.…”

Section: Introduction (mentioning)

confidence: 99%

“…In [1], predefined topic-in-set knowledge (which means predefined terms for certain topics) was added to supervise the topic assignment for individual terms. Compared with our model, their model only used the must-link knowledge, not cannot-links.…”

Section: Introduction (mentioning)

confidence: 99%
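Supervising a term's topic assignment with a predefined set of admissible topics, as the statement above describes, can be sketched as a masked sampling step. This is a minimal illustration that assumes a Gibbs-style sampler: `weights` stands in for the usual unnormalised conditional over topics, and the function name, the `allowed` argument, and the softness parameter `eta` are assumptions of this sketch rather than the cited authors' code.

```python
import random


def sample_topic_in_set(weights, allowed, eta=1.0, rng=None):
    """Sample a topic index, favouring (or forcing) topics in `allowed`.

    weights: unnormalised per-topic sampling weights (assumed given).
    allowed: set of topic indices this term may be assigned to.
    eta:     1.0 forbids topics outside the set; smaller values only
             down-weight them by a factor of (1 - eta).
    """
    rng = rng or random.Random()
    masked = [
        w * (1.0 if k in allowed else 1.0 - eta)
        for k, w in enumerate(weights)
    ]
    return rng.choices(range(len(weights)), weights=masked, k=1)[0]


# Toy example: four topics, the term is restricted to topics 1 and 2.
rng = random.Random(0)
draws = [sample_topic_in_set([0.4, 0.3, 0.2, 0.1], {1, 2}, rng=rng) for _ in range(200)]
```

With `eta=1.0` every draw falls inside the allowed set; lowering `eta` turns the hard restriction into a soft preference.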


Constrained LDA for Grouping Product Features in Opinion Mining

Zhai, Liu et al. 2011

Advances in Knowledge Discovery and Data Mining

105


Abstract. In opinion mining of product reviews, one often wants to produce a summary of opinions based on product features/attributes. However, for the same feature, people can express it with different words and phrases. To produce an effective summary, these words and phrases, which are domain synonyms, need to be grouped under the same feature. Topic modeling is a suitable method for the task. However, instead of simply letting topic modeling find groupings freely, we believe it is possible to do better by giving it some pre-existing knowledge in the form of automatically extracted constraints. In this paper, we first extend a popular topic modeling method, called LDA, with the ability to process large scale constraints. Then, two novel methods are proposed to extract two types of constraints automatically. Finally, the resulting constrained-LDA and the extracted constraints are applied to group product features. Experiments show that constrained-LDA outperforms the original LDA and the latest mLSA by a large margin.
