2016
DOI: 10.1177/0146621616684584
An Evaluation of Interrater Reliability Measures on Binary Tasks Using d-Prime

Abstract: Many indices of interrater agreement on binary tasks have been proposed to assess reliability, but none has escaped criticism. In a series of Monte Carlo simulations, five such indices were evaluated using d′, an unbiased indicator of raters' ability to distinguish between the true presence or absence of the characteristic being judged. Φ and, to a lesser extent, κ coefficients performed best across variations in characteristic prevalence, and raters' expertise and bias. Correlations with d′ for percentage agreement, Scott's π, and Gwet's AC1 w…
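The abstract's criterion can be made concrete with a short sketch. This assumes the standard equal-variance signal-detection definition, d′ = z(hit rate) − z(false-alarm rate); the helper name and the counts are illustrative, not taken from the paper.

```python
# Hypothetical illustration: d-prime for one rater's binary judgments
# against known ground truth. d' = z(hit rate) - z(false-alarm rate).
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' from a 2x2 confusion table, with a log-linear correction
    so rates of exactly 0 or 1 do not yield infinite z-scores."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Made-up example: 100 truly present cases and 100 truly absent cases.
print(d_prime(hits=85, misses=15, false_alarms=20, correct_rejections=80))
```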

Cited by 28 publications (11 citation statements)
References 21 publications
“…We did not code as suspicious anyone who generally thought that the study was strange (they thought it was weird, odd, or unusual that the answers were provided or that they were told not to look at the answers; they were not sure if the provided answers were correct). Between-coder agreement was high, κ = .904 (Cohen, 1960; Grant, Button, & Snook, 2017), and all disagreements were resolved by the team leads at each site.…”
Section: Results (mentioning; confidence: 99%)
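For readers unfamiliar with the statistic quoted above, here is a minimal sketch of Cohen's (1960) κ for two coders' binary codes. The data are made up for illustration, not the study's.

```python
# Hypothetical sketch: Cohen's kappa corrects observed agreement for
# the agreement expected by chance from each coder's marginal rates.
def cohens_kappa(coder_a, coder_b):
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # observed agreement
    p_yes = (sum(coder_a) / n) * (sum(coder_b) / n)          # chance both code 1
    p_no = (1 - sum(coder_a) / n) * (1 - sum(coder_b) / n)   # chance both code 0
    p_e = p_yes + p_no                                       # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Made-up codes (1 = suspicious, 0 = not suspicious); not the study's data.
a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
b = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(round(cohens_kappa(a, b), 3))  # 0.737 for these toy codes
```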
“…The third possibility is a statistical artifact. The expert group (11-year-old group) showed quite few instances of nonproficiency, which, combined with the small sample size for the analysis, biased the statistics to provide this result (Grant et al, 2017).…”
Section: Discussion (mentioning; confidence: 99%)
“…A recent paper by Grant, Button, and Snook (2017) suggested one resolution of this debate. In a series of Monte Carlo simulations, Grant and his colleagues used a novel criterion, d-prime, to assess the performance of five reliability measure indices.…”
Section: (Dis)agreement On Inter-rater Agreement (mentioning; confidence: 99%)
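The simulation logic this quote refers to can be sketched roughly as follows. This is an illustrative reconstruction under assumed parameter values (prevalence, sensitivity, criterion), not the authors' actual code.

```python
# Rough sketch of the Monte Carlo design described above: generate ground
# truth, simulate raters with a given sensitivity (d') and response bias
# (criterion), then compute an agreement index. All values are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def simulate_rater(truth, d_prime, criterion):
    # Equal-variance signal detection: evidence ~ N(d' * truth, 1);
    # the rater says "yes" when the evidence exceeds the criterion.
    evidence = rng.normal(loc=d_prime * truth, scale=1.0)
    return (evidence > criterion).astype(int)

n_cases, prevalence = 500, 0.3
truth = rng.binomial(1, prevalence, n_cases)
r1 = simulate_rater(truth, d_prime=2.0, criterion=1.0)
r2 = simulate_rater(truth, d_prime=2.0, criterion=1.0)

percent_agreement = np.mean(r1 == r2)
phi = np.corrcoef(r1, r2)[0, 1]  # phi = Pearson r for binary ratings
print(f"agreement={percent_agreement:.3f}, phi={phi:.3f}")
```

Repeating this across grids of prevalence, expertise, and bias values, and correlating each index with d′, is the essence of the evaluation the quote describes.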
“…In Panel c, both Chris and Laura say "yes" more often than "no" but differ in the ratio. Grant et al (2017) referred to this kind of sensitivity as Observer Expertise, and it is this factor that is central to reliability. Second, agreement might occur because both Chris and Laura come to the task with similar assumptions about the prevalence of cooperation.…”
Section: An Inter-rater Reliability Dilemma (mentioning; confidence: 99%)
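The shared-assumptions point is easy to demonstrate numerically. In this made-up example, two raters with the same strong "yes" bias and no sensitivity at all still agree on roughly 82% of cases, while a chance-corrected index stays near zero.

```python
# Illustration (made-up numbers): raters who both say "yes" ~90% of the
# time, independent of the truth, agree often by chance alone (d' = 0).
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
r1 = rng.binomial(1, 0.9, n)  # rater 1: "yes" 90% of the time, at random
r2 = rng.binomial(1, 0.9, n)  # rater 2: same bias, judging independently

percent_agreement = np.mean(r1 == r2)  # ~0.82 despite zero expertise
phi = np.corrcoef(r1, r2)[0, 1]        # ~0, correctly near chance
print(f"agreement={percent_agreement:.2f}, phi={phi:.2f}")
```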