2019
DOI: 10.1017/pan.2019.26

Word Embeddings for the Analysis of Ideological Placement in Parliamentary Corpora

Abstract: Word embeddings, the coefficients from neural network models predicting the use of words in context, have now become inescapable in applications involving natural language processing. Despite a few studies in political science, the potential of this methodology for the analysis of political texts has yet to be fully uncovered. This paper introduces models of word embeddings augmented with political metadata and trained on large-scale parliamentary corpora from Britain, Canada, and the United States. We fit the…

Cited by 96 publications (75 citation statements)
References 61 publications
“…Suggesting that latent knowledge regarding future discoveries is to an extent embedded in past academic papers (Tshitoyan et al 2019). Embeddings have also been successful at capturing latent concepts such as ideology, providing an integrated framework for an indirect study of political language (Rheault and Cochrane 2020).…”
Section: An Epistemology Of Word Embeddings (mentioning)
confidence: 99%
“…We have experimented with two different ways of providing snippets to the classifier in the form of numerical vectors (the type of input required by the algorithm to map them as points in space): a) the first was to represent them as term frequency-inverse document frequency (TF-IDF) vectors, which capture word-frequency information (Manning et al 2008); b) the second as averaged word embeddings (w-emb) (Mikolov et al 2013), which are vectors that capture semantic properties of texts, such as addressing the same topic even when using different words. While word frequency vectors have been largely adopted in text-based political science research (Hillard et al 2008; D'Orazio et al 2014; Merz et al 2016), word embeddings, due to their novelty, have been only recently employed, in particular for ideological positioning (Rheault and Cochrane 2019; Nanni et al 2019). Choosing between the two representations largely depends on whether the information that the classifier is aimed to capture is mentioned explicitly or is conveyed in a more implicit way.…”
Section: Automatic Identification Of Directionality (mentioning)
confidence: 99%
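The contrast drawn in the statement above, between TF-IDF vectors and averaged word embeddings, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the three toy snippets, and the two-dimensional `TOY_EMBEDDINGS` values are invented here and are not taken from any of the cited papers, which would use embeddings trained on a real corpus. The TF-IDF weighting uses a smoothed IDF, one of several common variants.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        if not doc:
            vectors.append([0.0] * len(vocab))
            continue
        counts = Counter(doc)
        # Term frequency is the count normalized by document length.
        vectors.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors


def averaged_embedding(doc, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    dim = len(next(iter(embeddings.values())))
    known = [embeddings[t] for t in doc if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical 2-d embeddings, hand-picked so that near-synonyms are close.
TOY_EMBEDDINGS = {
    "lower": [0.60, 0.40],
    "reduce": [0.55, 0.45],
    "taxes": [0.90, 0.10],
    "levies": [0.85, 0.15],
    "fund": [0.30, 0.70],
    "hospitals": [0.10, 0.90],
}

# Snippets 0 and 1 address the same topic with entirely different words.
snippets = [["lower", "taxes"], ["reduce", "levies"], ["fund", "hospitals"]]

vocab, tfidf = tfidf_vectors(snippets)
averaged = [averaged_embedding(s, TOY_EMBEDDINGS) for s in snippets]

print("TF-IDF similarity, snippets 0 and 1:", cosine(tfidf[0], tfidf[1]))
print("Embedding similarity, snippets 0 and 1:", cosine(averaged[0], averaged[1]))
print("Embedding similarity, snippets 0 and 2:", cosine(averaged[0], averaged[2]))
```

Because the first two snippets share no tokens, their TF-IDF vectors have no overlapping non-zero dimensions, while their averaged embeddings remain close. This is the sense in which the quoted passage says embeddings can capture texts "addressing the same topic even when using different words".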
“…At the beginning of this century, scientists started to use scaling algorithms such as wordfish and wordscore to place party manifestos on an ideological scale (Laver, Benoit, and Garry 2003; Laver and Garry 2000). Today, the research which uses Tzelgov and Olander (2018), and Benoit and Herzog (2017); Schwarz, Traber, and Benoit (2017), Lauderdale and Herzog (2016), and Debus and Bäck (2014); Proksch and Slapin (2012), Proksch and Slapin (2010), and Slapin and Proksch (2008): scaling algorithms | Diermeier et al (2012): SVM | Gentzkow, Shapiro, and Taddy (2016): custom model | polarization | Curini, Hino, and Osaka (2018): wordfish | Goet (2019), Abercrombie and Batista-Navarro (2018), and Peterson and Spirling (2018): text classifier | Spirling, Huang, and Patrick (2018): bayesian | Rheault and Cochrane (2019): word embeddings | sentiment | Rheault, Beelen, et al (2016): GloVe | Proksch, Lowe, et al (2018): Multilingual dictionary | floor time | Blumenau (2019) and Bäck, Debus, and Müller (2014): regression | topical prevalence | Høyland and Søyland (2019) and Greene and Cross (2017): topic models…”
Section: Parliamentary Documents As Data (mentioning)
confidence: 99%
“…However, with advances in the fields of computer science, i.e. natural language processing, and linguistics, political scientists have started to explore classification algorithms (Peterson and Spirling 2018;Goet 2019; Abercrombie and Batista-Navarro 2018), topic models (Høyland and Søyland 2019;Greene and Cross 2017) and word embeddings for their research (Rheault and Cochrane 2019;Rheault, Beelen, et al 2016).…”
Section: Parliamentary Documents As Data (mentioning)
confidence: 99%