2018
DOI: 10.1111/jedm.12172

Modeling Basic Writing Processes From Keystroke Logs

Abstract: The goal of this study is to model pauses extracted from writing keystroke logs as a way of characterizing the processes students use in essay composition. Two low-level timing measures were modeled: the interkey interval and its subtype, the intraword duration, both thought to reflect processes associated with keyboarding skills and composition fluency. Heavy-tailed probability distributions (lognormal and stable distributions) were fit to individual students' data. Both density functions fit reasonably well, and estimate…
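To make the abstract's modeling step concrete, here is a minimal sketch of how interkey intervals might be derived from keystroke timestamps and how a lognormal distribution could be fit to one student's data by maximum likelihood. It is an illustration under stated assumptions, not the authors' procedure: the function names `interkey_intervals` and `fit_lognormal`, the millisecond timestamp format, and the sample data are all hypothetical, and the stable-distribution fit mentioned in the abstract is not covered because it has no closed-form estimator.

```python
import math
from statistics import fmean


def interkey_intervals(timestamps_ms):
    """Return the gaps between consecutive keystroke timestamps.

    Assumes timestamps are in milliseconds and sorted in ascending order
    (a hypothetical log format, not one taken from the paper).
    """
    return [later - earlier
            for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def fit_lognormal(intervals):
    """Maximum-likelihood estimates (mu, sigma) of a lognormal distribution.

    For a lognormal, the MLE is the mean and the (population) standard
    deviation of the log-transformed data. Non-positive intervals are
    dropped because their logarithm is undefined.
    """
    logs = [math.log(x) for x in intervals if x > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive intervals")
    mu = fmean(logs)
    sigma = math.sqrt(fmean([(v - mu) ** 2 for v in logs]))
    return mu, sigma


if __name__ == "__main__":
    # Made-up timestamps for one writer: mostly short gaps, one long pause.
    stamps = [0, 120, 250, 400, 1900, 2050]
    gaps = interkey_intervals(stamps)
    mu, sigma = fit_lognormal(gaps)
    # exp(mu) is the median interval implied by the fitted lognormal.
    print(f"mu={mu:.3f} sigma={sigma:.3f} median_ms={math.exp(mu):.1f}")
```

Fitting on the log scale keeps the single long pause from dominating the estimate, which is one reason heavy-tailed models suit pause data. The per-student pair (mu, sigma) is the kind of summary statistic the abstract describes.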

Cited by 31 publications (30 citation statements)
References 26 publications (31 reference statements)
“…In terms of the low magnitudes between mixture modeling parameters and the human scores, this is similar to Guo et al’s (2018) finding where they used estimated stable distribution parameters to correlate with human scores. The low magnitudes on the correlations are not surprising, because the goal of the present research is not about replacing human score with the extracted process features.…”
Section: Discussion (supporting)
confidence: 75%
“…The advantage of researching the distributions of response time (e.g., pause data) is notable; that is, the parametric methods can provide meaningful summary statistics to represent each student. However, thus far, the only work incorporating a distribution-based analysis in large-scale log data was conducted by Guo et al (2018). While their work was a promising effort to quantify the writing process, the results were ambiguous, in terms of the meaning of the parameters extracted.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our own continuing work to advance the state of the art includes investigating the statistical properties of keystroke data (Guo, Deane, van Rijn, Zhang, & Bennett, ), trying to distinguish keyboarding skills from higher level writing skills (Deane et al., ), developing approaches to defining bursts of text personalized to the individual (as opposed to using a uniform threshold for all individuals) (Zhang, Hao, Li, & Deane, ), employing keystroke data to infer whether processes are consistent with the construct the test is designed to measure (Zhang, Zou, et al., ), and trying to model process durations and transitions (e.g., from text generation to editing) to see if they might suggest meaningful demographic group differences (Guo, Zhang, Deane, & Bennett, ). Ultimately, we and others will need to investigate and devise ways to report writing‐process information to teachers and students so that such information contributes to the development of more effective writing and writers.…”
Section: Discussion (mentioning)
confidence: 99%
“…Our intention was to develop an estimator that has little costs in regular data and thus can be used on a routine basis. Although our approach was based on the response times in an item, one could of course use different process data for weighting, like, for example, data from keystroke logs (Guo, Deane, van Rijn, Zhang, and Bennett, 2018). There are also some limitations.…”
Section: Discussion (mentioning)
confidence: 99%