2014
DOI: 10.1371/journal.pone.0087908

A Computational Approach to Qualitative Analysis in Large Textual Datasets

Abstract: In this paper I introduce computational techniques to extend qualitative analysis into the study of large textual datasets. I demonstrate these techniques by using probabilistic topic modeling to analyze a broad sample of 14,952 documents published in major American newspapers from 1980 through 2012. I show how computational data mining techniques can identify and evaluate the significance of qualitatively distinct subjects of discussion across a wide range of public discourse. I also show how examining large …

Cited by 64 publications (60 citation statements)
References 27 publications

“…If possible, topics are then labeled manually (e.g. Newman & Block, 2006; Evans, 2014). A possible next step is the aggregation of the results of topic-inference for all documents of a period to analyze topics over time, e.g.…”
Section: Statistical Models Of Contents With Topic Modeling
Citation type: mentioning
confidence: 99%
“…In contrast, domain knowledge can also be stated beforehand as a set of hypotheses about the topics and their distributions in the corpus and over time (e.g. DiMaggio et al, 2013; Evans, 2014).…”
Section: Evaluation Of Topic Models
Citation type: mentioning
confidence: 99%
“…The next step is to verify whether the results also have external validity. The strategy available to identify external validity is to observe the temporal pattern in which the topics occur and compare this pattern with real events that took place during the period analyzed (Newman et al, 2006; Evans, 2014).…”
Section: Results: What and When - International News In Brazilian News
Citation type: mentioning
confidence: 99%
“…However, if the model produces topics with words similar to "car," "paper," and "sky," it is not possible to find an interpretable label for that topic. Regarding external validity, for some researchers it is the ability of a topic model to capture events external to it (Newman et al, 2006; Evans, 2014). For example, if, when analyzing news stories that were written between September 8 and 12, 2001 in The New York Times, the topic model does not create a topic for terrorist attack, it means that the model has no external validity.…”
Section: Topic Modeling
Citation type: mentioning
confidence: 99%
“…The selection of an appropriate topic model involves a variety of tradeoffs and judgments by the human researcher (Evans, 2014); the selection of the model that is the best fit for the specific research question requires both qualitative and quantitative validation techniques. As noted above, one of the limitations of topic modeling is the requirement for the researcher to select the number of topics.…”
Section: Heuristics For Evaluating Topic Models
Citation type: mentioning
confidence: 99%