2023
DOI: 10.1109/tlt.2023.3253215

Integration of Prediction Scores From Various Automated Essay Scoring Models Using Item Response Theory

Abstract: In automated essay scoring (AES), essays are automatically graded without human raters. Many AES models based on various manually designed features or various architectures of deep neural networks have been proposed over the past few decades. Each AES model has unique advantages and characteristics. Therefore, rather than using a single AES model, appropriate integration of predictions from various AES models is expected to achieve higher scoring accuracy. In the present paper, we propose a method that uses it…

Citations: cited by 6 publications (6 citation statements)
References: 88 publications
“…In such cases, it is not easy to deduce which model should be used for the final AES. Consequently, instead of relying on a single AES model, incorporating predictions from multiple AES models in a suitable manner is expected to enhance scoring accuracy (Sagi & Rokach, 2018; Uto et al, 2023). An easy way to do this is simply by averaging the scores from the individual AES models or by adopting the majority vote.…”
Section: Automated Content Scoring (mentioning)
confidence: 99%
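The two baseline integration strategies named in the statement above, averaging and majority vote, can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function names, the round-half-up rule, and the tie-breaking rule (lowest score wins) are assumptions made only for this example, and scores are assumed to be integers.

```python
import math
from collections import Counter
from statistics import mean


def average_score(predictions):
    """Combine integer scores from several AES models by averaging.

    The mean is rounded half up to the nearest integer score
    (an assumption; other rounding rules are equally plausible).
    """
    return math.floor(mean(predictions) + 0.5)


def majority_vote(predictions):
    """Combine integer scores by picking the most frequent one.

    Ties are broken by taking the lowest tied score
    (an arbitrary choice made for this sketch).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    return min(score for score, count in counts.items() if count == top)


# Hypothetical predictions from four AES models for one essay.
model_scores = [2, 3, 3, 4]
combined_by_mean = average_score(model_scores)
combined_by_vote = majority_vote(model_scores)
```

Both baselines treat every model as equally reliable, which is the limitation that motivates weighting models by their scoring characteristics instead.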
“…In their recent work, Uto et al (2023) used a generalized many-facet Rasch model to integrate prediction scores from various AES models that were trained based on the ASAP dataset (Hamner et al, 2012) to predict a holistic score of writing quality. Their results showed that the proposed method achieved higher accuracy than that of individual AES models and conventional score-integration methods.…”
Section: Automated Content Scoring (mentioning)
confidence: 99%
“…Liu et al [29] designed a Two-Stage Learning Framework (TSLF) to extract semantic features, fluency features, and relevance features through the neural network model, fusing artificial features for scoring. Uto et al [30] proposed a fusion method that utilizes item response theory to consider differences in scoring behavioral characteristics and integrate prediction scores from various AES models.…”
Section: AES Based on Hybrid Model (mentioning)
confidence: 99%
“…The AES tools' correlations and agreement with human raters have become fairly high (Ifenthaler, 2022; Link & Koltovskaia, 2023; Warschauer & Ware, 2006). State-of-the-art models report quadratic weighted Kappas ranging from .57 to .80, with most in the low .70's, evidencing substantial agreement between the models and human raters (Beseiso et al, 2021; Uto et al, 2023). Many of these studies highlight the results of adjacent agreement between humans and AES systems rather than those of exact agreement (Ifenthaler & Dikli, 2015).…”
Section: Introduction (mentioning)
confidence: 99%
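For readers unfamiliar with the quadratic weighted kappa (QWK) figures quoted above, the following is a minimal sketch of the standard QWK computation between human and model scores. The function name and arguments are illustrative assumptions, not taken from any cited paper. It assumes integer scores in a known range with at least two score levels, and ratings that are not all identical.

```python
def quadratic_weighted_kappa(human, model, min_score, max_score):
    """Standard QWK between two lists of integer scores.

    Assumes max_score > min_score and that the expected disagreement
    is non-zero (i.e. the ratings are not all in a single category).
    """
    n = max_score - min_score + 1
    total = len(human)

    # Observed confusion matrix: rows are human scores, columns model scores.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, model):
        observed[h - min_score][m - min_score] += 1

    # Marginal histograms of each rater.
    hist_human = [sum(row) for row in observed]
    hist_model = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty grows with the squared score distance.
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_human[i] * hist_model[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a 1-4 scale.
perfect = quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4)
reversed_scores = quadratic_weighted_kappa([1, 2, 3, 4], [4, 3, 2, 1], 1, 4)
```

Because the penalty is quadratic in the score distance, a model that is off by one point is penalized far less than one that is off by three, which is why QWK can be fairly high even when exact agreement is modest.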