Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1100

TDNN: A Two-stage Deep Neural Network for Prompt-independent Automated Essay Scoring

Abstract: Existing automated essay scoring (AES) models rely on rated essays for the target prompt as training data. Despite their successes in prompt-dependent AES, how to effectively predict essay ratings in a prompt-independent setting, where no rated essays for the target prompt are available, remains a challenge. To close this gap, a two-stage deep neural network (TDNN) is proposed. In particular, in the first stage, using the rated essays for non-target prompts as the training data, a shallow model is learned…
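The abstract only outlines the two-stage idea, so the following is a minimal, hypothetical sketch of how such a prompt-independent pipeline could be wired up. The model choices, feature representation, and function names are illustrative assumptions, not the authors' TDNN implementation.

```python
# Hypothetical sketch of a two-stage, prompt-independent AES pipeline.
# Stage 1: a shallow model trained on rated essays from NON-target prompts
#          scores the unrated target-prompt essays to produce pseudo labels.
# Stage 2: a deeper model is then trained on those pseudo-labeled target essays.
# All model/feature choices below are assumptions, not the paper's code.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor


def two_stage_aes(non_target_essays, non_target_scores, target_essays):
    # Stage 1: shallow, prompt-independent scorer (TF-IDF + ridge regression).
    vec = TfidfVectorizer(ngram_range=(1, 2), max_features=20000)
    X_src = vec.fit_transform(non_target_essays)
    shallow = Ridge(alpha=1.0).fit(X_src, non_target_scores)

    # Score the target-prompt essays to obtain pseudo training labels.
    X_tgt = vec.transform(target_essays)
    pseudo_scores = shallow.predict(X_tgt)

    # Stage 2: fit a deeper regressor on the pseudo-labeled target essays.
    deep = MLPRegressor(hidden_layer_sizes=(256, 64), max_iter=200)
    deep.fit(X_tgt, pseudo_scores)
    return deep.predict(X_tgt)
```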

Cited by 69 publications (62 citation statements)
References 24 publications
“…Relatively few researchers have made progress on generic essay scoring: Phandi et al (2015) introduce a Bayesian regression approach that extracts N-gram features and then capitalizes on correlated features across prompts. Jin et al (2018) show promising prompt-independent results using an LSTM architecture with surface and part-of-speech N-gram inputs, underperforming prompt-specific models by relatively small margins across all ASAP datasets. In practice, however, much of practitioners' work is based on workarounds for prompt-specific models; Wilson et al (2019), for instance, describe psychometric techniques for measuring generic writing ability across a small sample of known prompts.…”
Section: Domain Transfer (mentioning)
confidence: 99%
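As a rough illustration of the kind of surface and part-of-speech N-gram inputs mentioned in the excerpt above, here is a hedged Python sketch; the tokenizer, tagger, and n-gram ranges are assumptions for illustration, not the cited papers' actual feature pipelines.

```python
# Illustrative sketch: surface word n-grams plus POS-tag n-grams as generic
# (prompt-independent) essay features. Not the cited papers' setups.
# First run may require: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

import nltk
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer


def pos_sequence(texts):
    # Replace each essay with its sequence of POS tags ("DT NN VBZ ...").
    return [" ".join(tag for _, tag in nltk.pos_tag(nltk.word_tokenize(t)))
            for t in texts]


features = FeatureUnion([
    ("word_ngrams", CountVectorizer(ngram_range=(1, 3))),
    ("pos_ngrams", Pipeline([
        ("to_pos", FunctionTransformer(pos_sequence)),
        ("vec", CountVectorizer(ngram_range=(1, 3))),
    ])),
])

X = features.fit_transform(["The essay argues clearly.", "It lack structure."])
```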
“…We also want to emphasize that extracting a massive number of features, versus our 23 features, adds to time complexity as well. We also compared our system to work published recently, in 2018: the TDNN system [87], which uses a two-stage neural network and reaches an average QWK score of 0.7365, i.e., 7.1% lower than ours. To the best of our knowledge, our system uses the fewest features among existing systems while achieving better results.…”
Section: G. Results and Discussion (mentioning)
confidence: 99%
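QWK in the excerpt above is quadratic weighted kappa, the agreement metric commonly reported on the ASAP data. A small sketch of computing it with scikit-learn follows; the sample ratings are made up for illustration.

```python
# Quadratic weighted kappa (QWK), the metric behind the 0.7365 figure above.
# The ratings below are invented solely to demonstrate the call.
from sklearn.metrics import cohen_kappa_score

human = [4, 3, 5, 2, 4, 3]    # reference (human) essay scores
system = [4, 3, 4, 2, 5, 3]   # automated scores for the same essays

qwk = cohen_kappa_score(human, system, weights="quadratic")
print(f"QWK = {qwk:.4f}")
```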