1991
DOI: 10.1177/014662169101500101
Reliability of Ratings for Multiple Judges: Intraclass Correlation and Metric Scales

Abstract: Scale-dependent procedures are presented for assessing the reliability of ratings for multiple judges using intraclass correlation. Scale type is defined in terms of admissible transformations, and standardizing transformations for ratio and interval scales are presented to solve the problem of adjusting ratings for "arbitrary scale factors" (unit and/or origin of the scale). The theory of meaningfulness of numerical statements is introduced and the coefficient of relational agreement (Stine, 1989b) is defined…
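
As a quick illustration of the intraclass correlation the abstract refers to: for n targets rated by k judges, the ICC is built from a two-way ANOVA decomposition of the ratings matrix. The sketch below follows the standard Shrout & Fleiss (1979) conventions rather than the paper's own scale-dependent procedures; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def icc_two_way(x):
    """Two-way ANOVA intraclass correlations for an (n targets x k judges)
    ratings matrix, in the Shrout & Fleiss (1979) conventions."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)  # between targets
    ms_cols = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)  # between judges
    ss_err = (np.sum((x - grand) ** 2)
              - (n - 1) * ms_rows - (k - 1) * ms_cols)
    ms_err = ss_err / ((n - 1) * (k - 1))
    # ICC(2,1): reliability of a single judge's rating, judges treated as random.
    icc_single = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
    # ICC(2,k): reliability of the mean of all k judges' ratings.
    icc_mean = (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)
    return icc_single, icc_mean

# Worked example: 6 targets rated by 4 judges (Shrout & Fleiss's classic data).
ratings = np.array([[9, 2, 5, 8],
                    [6, 1, 3, 2],
                    [8, 4, 6, 8],
                    [7, 1, 2, 6],
                    [10, 5, 6, 9],
                    [6, 2, 4, 7]])
print(icc_two_way(ratings))  # approx. (0.29, 0.62)
```

Note how the averaged-judges coefficient exceeds the single-judge one; the citing studies below lean on exactly this property of multiple ratings.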

Cited by 16 publications (5 citation statements)
References 14 publications

“…However, it is exactly because of this that the practice has been criticised in some studies as lacking reliability (see, for example, Boud, 1986; Swanson et al., 1991). A study by Falchikov & Magin (1997, p. 386) suggests that low rater reliability can be overcome with the use of multiple ratings, and this claim is in line with other studies (see, for example, Fagot, 1991; Houston et al., 1991; Magin, 1993). In another study, Falchikov (1986) has shown that devolving the assessment of group processes to peers can be carried out with a reasonable degree of reliability, although peer-teacher correlational analysis is obviously not possible in situations where only students give assessments.…”
Section: Peer Assessment of Group Work (supporting)
confidence: 68%

“…A very large number of assessors appears to produce marks that resemble those of the teacher less well than marks produced by a smaller number of raters or singletons. We were surprised to find that singletons performed as well as larger groups of students, given that it is generally acknowledged that multiple ratings are superior to single ones (e.g., Cox, 1967; Fagot, 1991). It has been argued that the use of multiple raters tends to improve reliability by increasing the ratio of true score variance to error variance (e.g., Ferguson, 1966).…”
Section: Discussion (mentioning)
confidence: 94%
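
The mechanism this excerpt invokes is captured by the Spearman-Brown prophecy formula: averaging k parallel ratings leaves true-score variance intact while shrinking error variance. A minimal sketch, with an assumed single-rater reliability of 0.40 (a hypothetical figure, not taken from the studies cited):

```python
def spearman_brown(r1: float, k: int) -> float:
    """Reliability of the mean of k parallel ratings, given
    single-rater reliability r1 (Spearman-Brown prophecy)."""
    return k * r1 / (1 + (k - 1) * r1)

for k in (1, 2, 4, 8):
    print(k, round(spearman_brown(0.40, k), 2))
# 1 -> 0.4, 2 -> 0.57, 4 -> 0.73, 8 -> 0.84
```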

“…Furthermore, although meta-analyses have revealed that the criterion-related validity coefficient of peer assessment can reach as high as 0.69, validity coefficients can differ significantly between studies (Falchikov & Goldfinch, 2000). In addition, although validity should theoretically increase with the number of assessors (Fagot, 1991; Houston, Raymond, & Svec, 1991), in practice this often fails to hold. For example, the meta-analysis of Falchikov and Goldfinch (2000) revealed that the criterion-related validity obtained by a single assessor is not necessarily lower than that obtained by multiple assessors.…”
Section: Introduction (mentioning)
confidence: 99%
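
Both halves of this statement, the theoretical gain from more assessors and the limited gains seen in practice, fall out of combining Spearman-Brown aggregation with the correction-for-attenuation bound (observed validity is at most true validity times the square root of reliability). A hypothetical sketch; the true validity of 0.80 and single-rater reliability of 0.40 are assumed for illustration, not taken from Falchikov and Goldfinch (2000):

```python
import math

def validity_of_mean(true_validity: float, r1: float, k: int) -> float:
    """Expected criterion validity of the mean of k raters: boost the
    raters' reliability via Spearman-Brown, then apply the attenuation bound."""
    rk = k * r1 / (1 + (k - 1) * r1)      # reliability of the k-rater mean
    return true_validity * math.sqrt(rk)  # validity ceiling at that reliability

for k in (1, 4, 16):
    print(k, round(validity_of_mean(0.80, 0.40, k), 2))
# 1 -> 0.51, 4 -> 0.68, 16 -> 0.76: gains flatten quickly, which is
# consistent with single assessors sometimes matching multiple ones.
```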