Abstract: Crowdsourcing platforms are commonly used for research in the humanities, social sciences and informatics, including the use of crowdworkers to annotate textual material or visuals. Utilizing two empirical studies, this article systematically assesses the potential of crowdcoding for less manifest contents of news texts, here focusing on political actor evaluations. Specifically, Study 1 compares the reliability and validity of crowdcoded data to that of manual content analyses; Study 2 proceeds to investigate…
“…Given that coders are treated as interchangeable, any (potentially) remaining coder idiosyncrasies (either coder-specific systematic errors or random measurement errors) are in effect no longer considered, either in the analyses or in the interpretations of the findings (see , for a detailed discussion of this issue). When there is a sufficiently large number of coders, or when each piece of material is coded by multiple coders ("duplicated coding", as in some SML applications or in crowdcoding: see Lind, Gruber, & Boomgaarden, 2017; Scharkow, 2013), the impact of coder idiosyncrasies, especially random errors, would diminish, as such errors cancel each other out as the number of coders or duplicated coding instances increases. Nevertheless, remaining systematic errors in coder idiosyncrasies may still introduce bias in gold standard materials with respect to the target of inference (i.e., a systematic deviation from the true target), especially for data with a higher level of intercoder reliability.…”
Section: Design and Setup of Monte Carlo Simulations (mentioning)
Political communication has become one of the central arenas of innovation in the application of automated analysis approaches to ever-growing quantities of digitized texts. However, although researchers routinely and conveniently resort to certain forms of human coding to validate the results derived from automated procedures, in practice the actual "quality assurance" of such a "gold standard" often goes unchecked. Contemporary practices of validation via manual annotations are far from being acknowledged as best practices in the literature, and the reporting and interpretation of validation procedures differ greatly. We systematically assess the connection between the quality of human judgment in manual annotations and the relative performance evaluations of automated procedures against true standards by relying on large-scale Monte Carlo simulations. The results from the simulations confirm that there is a substantially greater risk of a researcher reaching an incorrect conclusion regarding the performance of automated procedures when the quality of the manual annotations used for validation is not properly ensured. Our contribution should therefore be regarded as a call for the systematic application of high-quality manual validation materials in any political communication study drawing on automated text analysis procedures.
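To make the mechanism concrete, the following minimal sketch (not the authors' actual simulation design; all error rates, sample sizes, and variable names are illustrative assumptions) shows how random coder error attenuates when judgements are aggregated across coders, and how validating an automated classifier against a noisy single-coder gold standard can understate its true accuracy.

```python
# Minimal Monte Carlo sketch (illustrative assumptions only): how noisy
# "gold standard" annotations can distort the apparent accuracy of an
# automated classifier, and how duplicated coding mitigates random error.
import numpy as np

rng = np.random.default_rng(42)

N_DOCS = 2_000          # validation documents (assumption)
CLASSIFIER_ACC = 0.80   # true accuracy of the automated procedure (assumption)
CODER_ERROR = 0.20      # per-coder random error rate (assumption)
N_CODERS = 5            # coders per document under "duplicated coding"

def noisy_copy(labels, error_rate):
    """Flip each binary label with probability `error_rate` (random coder error)."""
    flips = rng.random(labels.size) < error_rate
    return np.where(flips, 1 - labels, labels)

# 1) True labels of the validation documents.
truth = rng.integers(0, 2, N_DOCS)

# 2) Automated classifier output with a known true accuracy.
automated = noisy_copy(truth, 1 - CLASSIFIER_ACC)

# 3a) Gold standard produced by a single error-prone coder.
gold_single = noisy_copy(truth, CODER_ERROR)

# 3b) Gold standard as the majority vote of several coders: independent
#     random errors largely cancel out as coders are added.
votes = np.stack([noisy_copy(truth, CODER_ERROR) for _ in range(N_CODERS)])
gold_majority = (votes.mean(axis=0) > 0.5).astype(int)

print("accuracy vs. truth:          ", (automated == truth).mean())
print("accuracy vs. single coder:   ", (automated == gold_single).mean())
print("accuracy vs. majority coding:", (automated == gold_majority).mean())
```

With uncorrelated errors, agreement with the single noisy coder is expected to sit well below the classifier's true accuracy, while the majority-coded gold standard recovers an estimate much closer to it; systematic (shared) coder errors, by contrast, would not cancel out in this way.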
“…Crowd-coding is both hailed as a useful strategy and viewed critically (Snow et al. 2008; Benoit et al. 2016; Lind et al. 2017; Dreyfuss 2018). Because Krippendorff's alpha was not higher for certain categories, we carried out additional analyses to see whether our results remain robust to the exclusion of certain workers.…”
Section: Crowd-coding of Open-ended Responses (mentioning)
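The robustness check quoted above, recomputing intercoder reliability while excluding individual workers, could be sketched roughly as follows. The example assumes the third-party Python package krippendorff and a purely illustrative ratings matrix; it is not the study's actual data or code.

```python
# Hypothetical leave-one-worker-out robustness check for crowd-coded data.
# Assumes the third-party `krippendorff` package (pip install krippendorff);
# the ratings matrix below is purely illustrative, not data from the study.
import numpy as np
import krippendorff

# Rows = crowd workers, columns = coded units; np.nan marks units a worker skipped.
ratings = np.array([
    [1, 0, 1, 1, 0, np.nan, 1,      0],
    [1, 0, 1, 0, 0, 1,      1,      0],
    [1, 1, 1, 1, 0, 1,      np.nan, 0],
    [0, 0, 1, 1, 0, 1,      1,      1],
])

overall = krippendorff.alpha(reliability_data=ratings,
                             level_of_measurement="nominal")
print(f"alpha with all workers: {overall:.3f}")

# Drop one worker at a time and recompute alpha to see whether any single
# worker drives the (dis)agreement.
for w in range(ratings.shape[0]):
    subset = np.delete(ratings, w, axis=0)
    a = krippendorff.alpha(reliability_data=subset,
                           level_of_measurement="nominal")
    print(f"alpha without worker {w}: {a:.3f}")
```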
Michael S. Moore is among the most prominent normative theorists to argue that retributive justice, understood as the deserved suffering of offenders, justifies punishment. Moore claims that the principle of retributive justice is pervasively supported by our judgments of justice and sufficient to ground punishment. We offer an experimental assessment of these two claims: (1) the pervasiveness claim, according to which people are widely prone to endorse retributive judgments, and (2) the sufficiency claim, according to which no non-retributive principle is necessary for justifying punishment. We test these two claims in a survey and a related survey experiment in which we present participants (N ≈ 900) with the stylized description of a criminal case. Our results seem to invalidate claim (1) and provide mixed results concerning claim (2). We conclude that retributive justice theories which advance either of these two claims need to reassess their evidential support. Address: University of Mannheim, MZES, A5 6, 68159 Mannheim. Data and RMarkdown code to fully reproduce the study are available upon request and will be stored online in the Harvard Dataverse upon publication.
“…This simple but powerful idea that good collective decisions can emerge from averaging many independent judgements of non-experts has long been discussed in academia, business and popular science (see Surowiecki 2004; Lehman & Zobel 2017). Yet, notwithstanding instructive earlier studies with positive conclusions regarding the validity of crowd-coded data (e.g., Berinsky et al. 2014; Haselmayer & Jenny 2016; Lind et al. 2017), it seems fair to say that crowd-coding is only starting to gain traction in political science at large since Benoit et al. (2016) have convincingly argued that the results of expert judgements – still considered the gold standard by many (e.g., when it comes to the location of parties) – can be matched with crowd-coding, at least for simple coding tasks. This is significant, since experts are expensive and in short supply, and automated (coding) methods are not yet good enough at extracting meaning (Benoit et al. 2016: 280).…”
Section: Introduction (mentioning)
confidence: 96%
“…; Haselmayer & Jenny 2016; Lind et al. 2017), it seems fair to say that crowd‐coding is only starting to gain traction in political science at large since Benoit et al. (2016) have convincingly argued that the results of expert judgements – still considered the gold standard by many (e.g., when it comes to the location of parties) – can be matched with crowd‐coding, at least for simple coding tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…This averaged confidence can also be conceived of as inter‐coder reliability within the crowd (see Lind et al. 2017). Everyone can use the above formula to arrive at the confidence score for a given statement, so it is very transparent.…”
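The formula referred to in this quote is not reproduced on this page. One common way to operationalise such a per-statement confidence score is the share of crowd coders who assign the modal code, sketched below with hypothetical judgements; this may differ from the citing article's exact formula.

```python
# Illustrative per-statement "confidence" score: the share of crowd coders
# who assign the modal (most frequent) code. One common operationalisation
# of agreement within the crowd; not necessarily the citing article's formula.
from collections import Counter

def crowd_confidence(judgements):
    """Return (modal_code, share_of_coders_agreeing) for one statement."""
    counts = Counter(judgements)
    modal_code, modal_count = counts.most_common(1)[0]
    return modal_code, modal_count / len(judgements)

# Hypothetical example: five crowd coders judging one party statement.
judgements = ["equality", "equality", "equality", "other", "equality"]
code, confidence = crowd_confidence(judgements)
print(code, confidence)   # equality 0.8
```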
Crowd‐coding is a novel technique that allows for fast, affordable and reproducible online categorisation of large numbers of statements. It combines judgements by multiple, paid, non‐expert coders to avoid miscoding(s). It has been argued that crowd‐coding could replace expert judgements, using the coding of political texts as an example in which both strategies produce similar results. Since crowd‐coding yields the potential to extend the replication standard to data production and to ‘scale’ coding schemes based on a modest number of carefully devised test questions and answers, it is important that its possibilities and limitations are better understood. While previous results for low-complexity coding tasks are encouraging, this study assesses whether and under what conditions simple and complex coding tasks can be outsourced to the crowd without sacrificing content validity in return for scalability. The simple task is to decide whether a party statement counts as a positive reference to a concept – in this case: equality. The complex task is to distinguish between five concepts of equality. To account for the crowd‐coders' contextual knowledge, the IP restrictions are varied. The basis for comparisons is 1,404 party statements, coded by experts and the crowd (resulting in 30,000 online judgements). Comparisons of the expert‐crowd match at the level of statements and party manifestos show that the results are substantively similar even for the complex task, suggesting that complex category schemes can be scaled via crowd‐coding. The match is only slightly higher when IP restrictions are used as an approximation of coder expertise.
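A rough sketch of the statement-level expert-crowd comparison described in this abstract: aggregate the crowd judgements for each statement by majority vote, then compute the share of statements on which the aggregated crowd code matches the expert code. The data, identifiers, and category labels below are hypothetical, not taken from the study.

```python
# Hypothetical sketch of a statement-level expert-crowd comparison:
# aggregate each statement's crowd judgements by majority vote, then compute
# the share of statements where the aggregated crowd code matches the expert
# code. Data and labels are illustrative, not from the study.
from collections import Counter, defaultdict

# (statement_id, crowd_code) pairs: several online judgements per statement.
crowd_judgements = [
    (1, "equality"), (1, "equality"), (1, "other"),
    (2, "other"),    (2, "other"),    (2, "other"),
    (3, "equality"), (3, "other"),    (3, "other"),
]
expert_codes = {1: "equality", 2: "other", 3: "equality"}

# Majority vote per statement.
by_statement = defaultdict(list)
for sid, code in crowd_judgements:
    by_statement[sid].append(code)
crowd_majority = {sid: Counter(codes).most_common(1)[0][0]
                  for sid, codes in by_statement.items()}

# Share of statements where the crowd's majority code matches the expert code.
match = sum(crowd_majority[sid] == expert_codes[sid] for sid in expert_codes)
print(f"expert-crowd match: {match / len(expert_codes):.2f}")  # 0.67 here
```

An analogous manifesto-level comparison would aggregate the same codes within each party manifesto (e.g., as category shares) before comparing expert and crowd results.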