1992
DOI: 10.1177/014662169201600109

Estimating Individual Rater Reliabilities

Abstract: Rating scales have no inherent reliability that is independent of the observers who use them. The often reported interrater reliability is an average of perhaps quite different individual rater reliabilities. It is possible to separate out the individual rater reliabilities given a number of independent raters who observe the same sample of ratees. Under certain assumptions, an external measure can replace one of the raters, and individual reliabilities of two independent raters can be estimated. In a somewhat…

Cited by 10 publications (5 citation statements); references 15 publications. Citing publications span 1995 to 2022.
“…Van den Bergh & Eiting (1989) assumed multiple quantitative ratings to be congeneric, tau-equivalent, or parallel and then used LISREL (Jöreskog & Sörbom, 1988) to fit these models. Overall & Magee (1992) proposed several simple models, such as the disattenuation model, the common factor model, the external criterion model, the treatment effects model, and the regression model, to estimate individual reliabilities of raters from simple bivariate correlations among their ratings. Item response modeling focuses on rater severity as an important aspect of rater consistency that needs to be examined.…”
Section: Three Issues
confidence: 99%
“…However, research has shown that substantial construct-irrelevant variance is introduced into essay scores as a consequence of the rating process alone (Congdon and McQueen, 2000). Even if the rating rubric has been constructed carefully, the reliability and validity of the rating process still depends mainly on the implementation of the rating activities (Overall and Magee, 1992). Because of variations in both the characteristics and status of raters, together with fluctuations between various rating environments, individual raters struggle to remain consistent across multiple rating processes, and different raters may assess the same samples differently.…”
Section: Introduction
confidence: 99%
“…Prominent among these sources is the variance associated with raters. This is a reflection of the concern that, no matter how carefully constructed, the reliability of a rating scale is critically dependent on the raters who operate it (Overall & Magee, 1992). As Dunbar, Koretz, and Hoover (1991) put it, "fallible raters can wreak havoc on the trustworthiness of scores and add a term to the reliability equation that does not exist in the tests that can be scored objectively."…”
confidence: 99%