1999
DOI: 10.2307/3315487

Beyond kappa: A review of interrater agreement measures

Abstract: In 1960, Cohen introduced the kappa coefficient to measure chance-corrected nominal scale agreement between two raters. Since then, numerous extensions and generalizations of this interrater agreement measure have been proposed in the literature. This paper reviews and critiques various approaches to the study of interrater agreement, for which the relevant data comprise either nominal or ordinal categorical ratings from multiple raters. It presents a comprehensive compilation of the main statistical approache…
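
For orientation, the "chance-corrected agreement" named in the abstract refers to the standard definition of Cohen's kappa, given here for reference (this formula is well established but is not quoted from the paper):

\kappa = \frac{p_o - p_e}{1 - p_e}

where p_o is the observed proportion of agreement between the two raters and p_e is the proportion of agreement expected by chance from the raters' marginal category frequencies.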

Cited by 783 publications (468 citation statements)
References 62 publications
“…The first (CVR) and second (ENL) observer scored these traits on two separate occasions. The strength of agreement between these scoring sessions was calculated using Cohen's Kappa statistic [9]. Agreement was determined to be excellent with a correlation coefficient greater than 0.75, moderate when the correlation coefficient was between 0.40 and 0.74, and poor when it was less than 0.40.…”
Section: Methods (mentioning)
confidence: 99%
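
The Methods excerpt above computes Cohen's kappa between two scoring sessions and labels the result with excellent / moderate / poor bands. Below is a minimal sketch of that kind of calculation; the function names, the toy ratings, and the exact band boundaries are illustrative assumptions, not code or data from the cited study.

# Hedged sketch: Cohen's kappa for two raters over nominal categories,
# with agreement bands paraphrased from the quoted Methods excerpt.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two equal-length lists of nominal ratings."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed proportion of exact agreement.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from the two raters' marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    if p_e == 1:
        return 1.0  # degenerate case: both raters use a single shared category
    return (p_o - p_e) / (1 - p_e)

def interpret(kappa):
    # Bands paraphrased from the excerpt: >0.75 excellent, 0.40-0.74 moderate, <0.40 poor.
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "moderate"
    return "poor"

# Illustrative ratings only (not data from the cited study).
session_1 = ["present", "absent", "present", "present", "absent", "present"]
session_2 = ["present", "absent", "absent", "present", "absent", "present"]
k = cohens_kappa(session_1, session_2)
print(f"kappa = {k:.2f} ({interpret(k)})")

On the toy ratings this prints kappa = 0.67 (moderate), i.e. how a value inside the 0.40-0.74 band from the excerpt would be labelled.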
“…In addition, to prevent any reference to differences in the seriousness of the offenses, the seriousness scores were removed from the RMG list for both coders and the offenses in this list were ordered alphabetically based on their law section description. Using Cohen's Kappa formula, the interrater agreement for the 60 cases that both coders handled was κ = .63, p < .001, 95% CI [.49, .76], which is a fair to good level of agreement beyond chance (Banerjee, Capozzoli, McSweeney & Sinha, 1999; Landis & Koch, 1977). Accordingly, the incarceration seriousness scores that corresponded to the offense types from the RMG list were assigned to each mediation case based on the 199 codings of the first coder.…”
Section: Assigning Population-based Incarceration Seriousness Scores (mentioning)
confidence: 99%
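
The excerpt above reports kappa together with a 95% confidence interval. The quoted study does not say how that interval was obtained; one common option is a percentile bootstrap over cases, sketched below purely as an assumption (it reuses the cohens_kappa function from the earlier sketch).

# Hedged sketch: percentile-bootstrap 95% CI for Cohen's kappa.
# The cited study may well have used an analytic standard error instead;
# this only illustrates one common way to attach an interval to kappa.
import random

def bootstrap_kappa_ci(rater_a, rater_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for Cohen's kappa; reuses cohens_kappa() defined above."""
    rng = random.Random(seed)
    n = len(rater_a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        stats.append(cohens_kappa([rater_a[i] for i in idx],
                                  [rater_b[i] for i in idx]))
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi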
“…For instance, the negotiated IRR of the cognitive presence coding scheme ranged from RC = 0.6522 to 0.7957 and K = 0.5014 to 0.6513. This signified that there was poor agreement (according to [29]) between the two coders. Furthermore, the comments of the coders disclosed the difficulties that they had encountered when following the coding schemes.…”
Section: Results (mentioning)
confidence: 98%