2021
DOI: 10.1017/pan.2021.37
Cross-Domain Topic Classification for Political Texts

Abstract: We introduce and assess the use of supervised learning in cross-domain topic classification. In this approach, an algorithm learns to classify topics in a labeled source corpus and then extrapolates topics in an unlabeled target corpus from another domain. The ability to use existing training data makes this method significantly more efficient than within-domain supervised learning. It also has three advantages over unsupervised topic models: the method can be more specifically targeted to a research question …
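To make the abstract's setup concrete, the sketch below shows one way a cross-domain topic classifier could be wired up: fit a supervised model on a labeled source corpus, then extrapolate topic labels (or class probabilities) to an unlabeled target corpus from another domain. The corpora, features, and model choices here are illustrative placeholders, not the authors' exact pipeline.

```python
# A minimal sketch of cross-domain topic classification, assuming a labeled
# source corpus and an unlabeled target corpus from a different domain.
# All documents, labels, and pipeline settings below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical source corpus: documents with human-coded topic labels.
source_texts = ["we will expand public health insurance", "cut corporate taxes now"]
source_topics = ["health", "economy"]

# Hypothetical target corpus from another domain (no labels available).
target_texts = ["the member asked about hospital waiting lists"]

# Learn a mapping from word features to topics on the source domain ...
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(source_texts, source_topics)

# ... and extrapolate topic labels and class probabilities to the target domain.
predicted_topics = clf.predict(target_texts)
topic_probabilities = clf.predict_proba(target_texts)
```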

Cited by 23 publications (35 citation statements); references 36 publications (54 reference statements).
“…Using the text classification method, we can automate many types of analyses in political science. As listed in the examples in Figure 2, researchers can detect the political perspective of news articles (Huguet Cabot et al., 2020), the stance in media on a certain topic (Luo et al., 2020), whether campaigns use positive or negative sentiment (Ansolabehere and Iyengar, 1995), which issue area the legislation is about (Adler and Wilkerson, 2011), topics in parliamentary speech (Albaugh et al., 2013; Osnabrügge et al., 2021), congressional bills (Hillard et al., 2008; Collingwood and Wilkerson, 2012) and the political agenda (Karan et al., 2016), whether an international statement is peaceful or belligerent (Schrodt, 2000), whether a speech contains positive or negative sentiment (Schumacher et al., 2016), and whether a U.S. Circuit Courts case decision is conservative or liberal (Hausladen et al., 2020).…”
Section: NLP for Text Analysis (mentioning)
confidence: 99%
“…Researchers often adopt a weighting scheme, called term frequency-inverse document frequency (TF-IDF), that gives more weight to less frequent words. The main advantage of dictionary methods is the ease of interpretation, while the main disadvantage is low design efficiency: before conducting any analysis, researchers must spend a significant amount of time designing a classification scheme by compiling an exhaustive list of keywords that belong to each category (Osnabrügge et al., 2021). A second method for automated text analysis is probabilistic topic modeling.…”
Section: Independent Variables: IMF Program Participation and Conditi… (mentioning)
confidence: 99%
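The excerpt above mentions two building blocks: TF-IDF weighting, which down-weights words that appear in many documents, and dictionary classification based on hand-compiled keyword lists. The sketch below illustrates both with invented documents and keyword lists; it is not drawn from any of the cited papers.

```python
# Minimal sketch: TF-IDF weighting and a simple dictionary classifier.
# Documents and keyword lists are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the bill raises the minimum wage",
    "the bill funds hospitals and public health",
    "the committee debated the bill",
]

# TF-IDF: term frequency scaled by inverse document frequency, so "bill"
# (which appears in every document) carries less weight than "wage" or "hospitals".
tfidf = TfidfVectorizer()
weights = tfidf.fit_transform(docs)  # sparse document-term matrix of TF-IDF weights

# Dictionary method: hand-compiled keyword lists, one per category; a document
# gets the category whose keywords it matches most often (ties fall to the
# first category listed).
dictionary = {
    "economy": {"wage", "tax", "budget"},
    "health": {"hospitals", "health", "insurance"},
}

def classify(doc):
    tokens = set(doc.lower().split())
    return max(dictionary, key=lambda cat: len(dictionary[cat] & tokens))

labels = [classify(d) for d in docs]
```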
“…A topic is a distribution over a fixed vocabulary (Blei, 2012); for example, the topic natural resources has a fixed vocabulary that includes words like oil, mining, and hydrocarbon. Topic models have high design efficiency (Osnabrügge et al., 2021), because they do not require training sets and are suitable for new discoveries: they can parse the data to identify hidden patterns that are not immediately evident to the human eye (like the unobservable influence of IMF conditionality on domestic legislation).…”
Section: Data and Descriptive Analysis (mentioning)
confidence: 99%
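Since the excerpt above describes a topic as a distribution over a fixed vocabulary learned without training labels, a small unsupervised sketch may help. The example below fits an LDA topic model with scikit-learn on an invented four-document corpus; the corpus and the choice of two topics are illustrative assumptions.

```python
# Minimal sketch of an unsupervised topic model (LDA): no labeled training
# set is needed, and each fitted topic is a weight vector over the vocabulary.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "oil and mining royalties fund the budget",
    "hydrocarbon exports and oil prices",
    "hospital funding and public health insurance",
    "health insurance reform for hospitals",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)  # per-document topic shares

# Each row of components_ is (proportional to) a topic's distribution over the
# vocabulary; the highest-weight words characterize the topic.
vocab = vectorizer.get_feature_names_out()
top_words = [
    [vocab[i] for i in topic.argsort()[::-1][:3]]
    for topic in lda.components_
]
```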
“…We are interested in forming a predicted probability of the source of a document for scoring influence in a second corpus. Other related methods are Peterson and Spirling (2018) and Osnabrügge et al. (2021). We have fewer snippets from FNC than from CNN/MSNBC. Thus, we randomly under-sample the snippets from the CNN/MSNBC corpus to match the number of snippets from FNC. Previous work has shown that supervised learning models using n-grams are rarely sensitive to the specific choices in pre-processing and featurization (e.g., Denny and Spirling, 2018).…”
(mentioning)
confidence: 99%
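The source-prediction idea in the excerpt above can be sketched as follows, assuming two unequal collections of snippets, under-sampling of the larger one, and a classifier whose predicted probabilities are then used to score documents in a second corpus. The placeholder snippets and the TF-IDF-plus-logistic-regression pipeline are illustrative assumptions, not the cited paper's exact implementation.

```python
# Minimal sketch: under-sample the larger corpus, train a source classifier,
# then score a second corpus by predicted probability of one source.
# All snippets below are placeholders.
import random

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

fnc_snippets = ["placeholder transcript snippet one", "placeholder transcript snippet two"]
cnn_msnbc_snippets = [
    "placeholder transcript snippet three",
    "placeholder transcript snippet four",
    "placeholder transcript snippet five",
]

# Under-sample the larger corpus so the two classes are balanced.
random.seed(0)
cnn_msnbc_sample = random.sample(cnn_msnbc_snippets, k=len(fnc_snippets))

texts = fnc_snippets + cnn_msnbc_sample
labels = [1] * len(fnc_snippets) + [0] * len(cnn_msnbc_sample)  # 1 = FNC

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(texts, labels)

# Score a second corpus: predicted probability that each document is FNC-like.
second_corpus = ["placeholder snippet from a later corpus"]
fnc_scores = model.predict_proba(second_corpus)[:, 1]
```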