2021
DOI: 10.1017/pan.2021.33

Topics, Concepts, and Measurement: A Crowdsourced Procedure for Validating Topics as Measures

Abstract: Topic models, as developed in computer science, are effective tools for exploring and summarizing large document collections. When applied in social science research, however, they are commonly used for measurement, a task that requires careful validation to ensure that the model outputs actually capture the desired concept of interest. In this paper, we review current practices for topic validation in the field and show that extensive model validation is increasingly rare, or at least not systematically repor…

Citations: cited by 35 publications (33 citation statements)
References: 51 publications
“…UML methods cluster data without relying on manual coding, finding similar words or texts in the data and grouping them together. While they are best suited for exploratory analyses, these methods can be repurposed to measurement of concepts of interest in text (Ying, Montgomery, and Stewart 2022). Here we assess their performance in capturing subtle internal states in short texts using the following procedure: (1) we use UML models to group texts, (2) examine whether some of these groups correspond to our coding categories and, if so, (3) code texts that belong to these groups for corresponding categories.…”
Section: Step 2-methods Choice (mentioning)
confidence: 99%
“…Here we assess their performance in capturing subtle internal states in short texts using the following procedure: (1) we use UML models to group texts, (2) examine whether some of these groups correspond to our coding categories and, if so, (3) code texts that belong to these groups for corresponding categories. While we rely on our interpretation and expertise in matching groups to coding categories, researchers can resort to more sophisticated methods for the evaluation, validation, and labelling of results obtained with UML models (Ying et al 2022). In this overview, we survey several UML algorithms.…”
Section: Step 2-methods Choice (mentioning)
confidence: 99%
“…The researchers appreciate that there are existing topic modelling techniques and machine learning approaches for computationally extracting topics and classifying users 34 36 . However, the researchers have gone through the pain of manually annotating the data because of our research interest to further develop knowledge about the identified topics and user categories.…”
Section: Methods (mentioning)
confidence: 99%
“…This path forward is promising but also entails risks, because there is no guarantee that an algorithm operating on word frequencies will arrive at a meaningful definition of a cultural category. For this reason, unsupervised methods to summarize text data always place the burden on the researcher to justify their chosen interpretation and validate the utility of the topics learned (Grimmer et al, 2022; Grimmer and Stewart, 2013; Ying et al, 2021). LDA thus illustrates a key idea that applies more broadly to unsupervised methods: while these methods may appear to inductively discover insights from the data alone, they actually involve extensive theoretical work on the part of the researcher to justify and interpret the result.…”
Section: Dimension Reduction: Unsupervised Machine Learning Can Summa... (mentioning)
confidence: 99%