2007
DOI: 10.1002/j.2333-8504.2007.tb02063.x

Construct Validity of e-rater® in Scoring TOEFL® Essays

Abstract: This study examined the construct validity of the e-rater® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two e-rater scores were investigated in this study, the first based on optimally predicting the human essay score and the second based on equal weights for the different features of e-rater. Within a multitrait-multimethod approach, the correlations and …

Cited by 40 publications (57 citation statements)
References 15 publications (22 reference statements)

“…For all analyses involving individual raters, ratings have been randomly assigned to Rater 1 or Rater 2. Table 2 shows interrater reliability statistics for these ratings; overall, they are comparable to statistics found in similar studies (e.g., Attali, 2007; Attali & Burstein, 2006). For example, Attali and Burstein (2006) reported exact agreement rates of two human raters of .59 and one human rater with e-rater of .58.…”
Section: Methods (supporting)
confidence: 74%
“…This study adds to the growing literature related to the validation and use of e-rater for TOEFL essays (e.g., Attali, 2007, 2008; Attali & Burstein, 2006; Chodorow & Burstein, 2004; Enright & Quinlan, 2008; Lee et al., 2008). In terms of the validity argument for the TOEFL outlined by Chapelle et al. (2008), the study provides evidence that support the inferences of generalization (across tasks and raters) and extrapolation to other criteria of writing ability in academic contexts.…”
Section: Implications and Future Directions (supporting)
confidence: 59%
“…First, factor analyses were performed to investigate the structure of e-rater features when the DWU measures are added to the set. Factor analyses of both TOEFL computer-based test essays (Attali, 2007) and essays written by native English speakers from a wide developmental range (4th to 12th grade; Attali & Powers, 2008, 2009) revealed a similar underlying structure of the e-rater features. This three-factor structure has an attractive hierarchical linguistic interpretation with a word choice factor, a grammatical-conventions-within-a-sentence factor, and a fluency factor.…”
Section: Evaluation Setup (mentioning)
confidence: 93%