With increased emphasis on teacher quality in the federal Race to the Top grant program, rater agreement has become an important topic in teacher evaluation. Variants of the kappa statistic have often been used to assess inter-rater reliability (IRR), but research has shown that kappa suffers from a paradox in which high exact agreement can produce low kappa values. Two chance-corrected measures of IRR were examined to determine whether Gwet’s AC1 statistic is a more stable estimate than kappa. Findings suggest that Gwet’s AC1 outperforms kappa as a chance-corrected measure of IRR when benchmarked against exact agreement, and that it shows promise for future IRR studies in a teacher evaluation context.
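The paradox can be illustrated with a small worked example. Cohen’s kappa is κ = (p_o − p_e)/(1 − p_e), where p_e is chance agreement computed from each rater’s marginal rates; Gwet’s AC1 uses the same form but estimates chance agreement as p_e = (1/(K − 1)) Σ_k π_k(1 − π_k), with π_k the average prevalence of category k across raters. The sketch below, with invented counts not drawn from the study, shows how skewed marginals drive kappa toward zero even at 85% exact agreement, while AC1 stays close to the observed agreement.

```python
# Toy illustration of the kappa paradox: two raters score the same 100
# lessons as "effective" (1) or "not effective" (0). The counts below are
# hypothetical, chosen only to show the paradox.

def cohens_kappa(table):
    """Cohen's kappa for a 2x2 contingency table [[a, b], [c, d]]."""
    n = sum(sum(row) for row in table)
    p_o = (table[0][0] + table[1][1]) / n        # observed exact agreement
    row1 = (table[0][0] + table[0][1]) / n       # rater A's rate for category 1
    col1 = (table[0][0] + table[1][0]) / n       # rater B's rate for category 1
    p_e = row1 * col1 + (1 - row1) * (1 - col1)  # chance agreement from marginals
    return (p_o - p_e) / (1 - p_e)

def gwets_ac1(table):
    """Gwet's AC1 for a 2x2 contingency table [[a, b], [c, d]]."""
    n = sum(sum(row) for row in table)
    p_o = (table[0][0] + table[1][1]) / n
    row1 = (table[0][0] + table[0][1]) / n
    col1 = (table[0][0] + table[1][0]) / n
    pi1 = (row1 + col1) / 2                      # average prevalence of category 1
    p_e = 2 * pi1 * (1 - pi1)                    # AC1 chance agreement (K = 2 case)
    return (p_o - p_e) / (1 - p_e)

# Skewed marginals: 85% exact agreement, yet kappa is near zero (even negative),
# while AC1 remains close to the observed agreement.
table = [[85, 10], [5, 0]]
print(f"kappa = {cohens_kappa(table):.3f}")  # ~ -0.071
print(f"AC1   = {gwets_ac1(table):.3f}")     # ~  0.826
```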