2016
DOI: 10.1017/s0003055416000058

Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data

Abstract: Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of non-experts, we generate results comparable to those…

Cited by 217 publications (185 citation statements) · References 45 publications
“…These parameters represent the degree to which an expert diverges from other experts who code the same cases. This operationalization aligns with classic definitions of reliability (Carmines and Zeller, 1979), as well as recent empirical work examining convergence among workers on crowd-sourcing platforms when coding the same cases (Benoit et al., 2016; Marquardt et al., 2017). As potential correlates of reliability, we use both demographic data from a post-survey questionnaire and the coding characteristics of experts.…”
mentioning
confidence: 84%
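The excerpt above operationalizes a coder's (un)reliability as divergence from other coders who code the same cases. As a rough, hypothetical illustration of that idea (not the cited authors' actual model), the sketch below scores each coder by the average absolute distance between their codes and the mean code of the remaining coders; the data layout and numeric coding scale are assumptions.

```python
# Minimal sketch (illustrative only): reliability as divergence from other coders
# coding the same cases. Assumes a numeric coding scale and a coders-by-cases matrix.
import numpy as np

def coder_divergence(codes: np.ndarray) -> np.ndarray:
    """codes: (n_coders, n_cases) matrix of numeric codes; NaN marks uncoded cases."""
    n_coders, _ = codes.shape
    divergence = np.full(n_coders, np.nan)
    for i in range(n_coders):
        others = np.delete(codes, i, axis=0)       # drop coder i
        other_mean = np.nanmean(others, axis=0)    # consensus of the remaining coders
        diff = np.abs(codes[i] - other_mean)       # distance on each shared case
        divergence[i] = np.nanmean(diff)           # higher value = less reliable coder
    return divergence

# Example: the third coder systematically diverges from the other two.
codes = np.array([[1.0, 2.0, 3.0, 2.0],
                  [1.0, 2.0, 3.0, 2.0],
                  [3.0, 1.0, 1.0, 3.0]])
print(coder_divergence(codes))   # approximately [0.75, 0.75, 1.5]
```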
“…The data generation process can potentially be very quick, even for larger amounts of data, and it likely comes at considerably lower cost than a traditional manual approach. Furthermore, crowd-coded content analysis data may potentially be more reliable and easier to replicate (Benoit, Conway, Lauderdale, Laver, & Mikhaylov, 2016). While the discipline has fairly standardized procedures for manual content analysis, such procedures are lacking for crowd-sourced content analysis.…”
mentioning
confidence: 99%
“…With many coders and a reasonably optimistic assumption on their individual accuracy, however, the majority is able to quite reliably select the correct category (see Figure 2). This is why crowd-sourcing approaches to content analysis can produce data of acceptable quality from multiple codings per unit by minimally trained coders (Benoit, Conway, Lauderdale, Laver, & Mikhaylov, 2016). In the end, whether or not a researcher is willing to trust a majority standard depends on his or her assumptions about the accuracy of the single coders.…”
Section: Approximations of the Misclassification Matrix
mentioning
confidence: 99%
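The majority-vote logic in the last excerpt is essentially a Condorcet jury argument: if each coder independently picks the correct category with probability p > 0.5, the probability that the majority is correct grows quickly with the number of coders. A minimal sketch, assuming a binary correct/incorrect coding decision with identical, independent coders (an assumption made here for illustration, not the setup of the cited paper):

```python
# Minimal sketch: probability that a strict majority of k independent coders,
# each correct with probability p, selects the correct category.
from math import comb

def majority_correct(p: float, k: int) -> float:
    """Sum the binomial probabilities of more than k/2 coders being correct."""
    return sum(comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range(k // 2 + 1, k + 1))

# Modestly accurate individual coders aggregate into a reliable majority.
for k in (1, 5, 15):
    print(k, round(majority_correct(0.7, k), 3))
# prints approximately: 1 0.7, 5 0.837, 15 0.95
```

With individual accuracy of only 0.7, five codings per unit already yield a majority that is correct roughly 84% of the time, and fifteen codings reach roughly 95%, which illustrates why multiple codings per unit by minimally trained coders can produce data of acceptable quality.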