Proceedings of the Seventh ACM Conference on Learning @ Scale 2020
DOI: 10.1145/3386527.3405945

Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing

Abstract: Knowledge tracing, the act of modeling a student's knowledge through learning activities, is an extensively studied problem in the field of computer-aided education. Armed with attention mechanisms focusing on relevant information for target prediction, recurrent neural networks and Transformer-based knowledge tracing models have outperformed traditional approaches such as Bayesian knowledge tracing and collaborative filtering. However, the attention mechanisms of current state-of-the-art knowledge tracing models…
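The abstract's central question is how the queries, keys, and values fed to attention are computed. As background, here is a minimal sketch of the scaled dot-product attention that the models discussed below build on; the function name and shapes are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value, mask=None):
    """Standard scaled dot-product attention (Vaswani et al.).

    query: (batch, tgt_len, d_k); key: (batch, src_len, d_k); value: (batch, src_len, d_v).
    Returns the attended values and the attention weights.
    """
    d_k = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) / d_k ** 0.5
    if mask is not None:
        # In knowledge tracing the mask is causal: the prediction for step t
        # may only attend to interactions that happened before t.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, value), weights
```

Knowledge tracing models differ mainly in what they plug in as query, key, and value, which is exactly the design space the paper's title refers to.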

Cited by 112 publications (58 citation statements) · References 9 publications
“…In each self-attention layer of SAKT, each query is an exercise embedding vector, and the keys and values are interaction embedding vectors. SAINT [2] is the first Transformer-based knowledge tracing model that leverages an encoder-decoder architecture composed of stacked self-attention layers. Unlike SAKT, SAINT takes separated streams of exercises and responses as inputs, where the sequence of exercises is fed to the encoder, and the sequence of encoder outputs and responses is fed to the decoder.…”
Section: Related Work
confidence: 99%
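The quoted passage describes the SAKT-style assignment of inputs to attention: queries come from the target exercise, keys and values from past interactions. A hypothetical sketch of that assignment (layer sizes, embedding scheme, and masking are assumptions, not the published SAKT configuration):

```python
import torch
import torch.nn as nn

class SAKTStyleAttention(nn.Module):
    """Query = exercise embedding; key and value = interaction embeddings,
    as described in the quotation above. Purely illustrative."""

    def __init__(self, num_exercises, d_model=128, num_heads=8):
        super().__init__()
        self.exercise_emb = nn.Embedding(num_exercises, d_model)
        # An interaction id encodes an (exercise, correct/incorrect) pair.
        self.interaction_emb = nn.Embedding(2 * num_exercises, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, exercises, past_interactions, causal_mask=None):
        q = self.exercise_emb(exercises)              # queries
        kv = self.interaction_emb(past_interactions)  # keys and values
        out, _ = self.attn(q, kv, kv, attn_mask=causal_mask)
        return out
```

SAINT, by contrast, separates the exercise and response streams across an encoder and a decoder, as sketched after the next citation statement.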
“…In this subsection, we give a brief review of SAINT, a Separated Self-AttentIve Neural Knowledge Tracing model. We refer readers who want to learn the detailed aspects of SAINT to the paper [2]. SAINT is a knowledge tracing model based on the Transformer [25]. The most fundamental part of SAINT is the multi-head attention layer.…”
Section: SAINT+ 4.1 SAINT: Separated Self-Attentive Neural Knowledge Tracing
confidence: 99%
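A compact sketch of the separated-stream encoder-decoder arrangement described above, using stock PyTorch Transformer layers as stand-ins for SAINT's stacked attention blocks; the dimensions, the prediction head, and the omission of positional encodings are simplifying assumptions, not SAINT's exact design.

```python
import torch
import torch.nn as nn

class SeparatedStreamKT(nn.Module):
    """Exercises go to the encoder; responses, attending to the encoder
    output, go to the decoder (SAINT-style separation). Illustrative only."""

    def __init__(self, num_exercises, d_model=128, nhead=8, num_layers=4):
        super().__init__()
        self.exercise_emb = nn.Embedding(num_exercises, d_model)
        self.response_emb = nn.Embedding(2, d_model)  # correct / incorrect
        enc = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers)
        self.decoder = nn.TransformerDecoder(dec, num_layers)
        self.predict = nn.Linear(d_model, 1)

    def forward(self, exercises, responses, causal_mask):
        memory = self.encoder(self.exercise_emb(exercises), mask=causal_mask)
        out = self.decoder(self.response_emb(responses), memory,
                           tgt_mask=causal_mask, memory_mask=causal_mask)
        # Probability that the response to each queried exercise is correct.
        return torch.sigmoid(self.predict(out)).squeeze(-1)
```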
“…However, unlike the single matrix in MANN, DKVMN uses two matrices: a key matrix for latent concepts and a value matrix for student mastery levels. More recently, Transformers have also been applied to train DKTs [225][226][227]. Since they employ a self-attention mechanism, the information encoded by Transformers may also be interpreted.…”
Section: Other Applications Of Deep Neural Network In…
confidence: 99%
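The key/value-matrix split mentioned in the quote can be illustrated with the read step of a DKVMN-style memory; the names and shapes below are assumptions made for the sketch, not the original model's code.

```python
import torch
import torch.nn.functional as F

def dkvmn_read(query_key, key_matrix, value_matrix):
    """Read step of a DKVMN-style memory.

    query_key:    (d_k,)    embedding of the queried exercise
    key_matrix:   (N, d_k)  static keys, one per latent concept
    value_matrix: (N, d_v)  per-student mastery values, updated over time
    """
    # Correlation weights: how strongly the exercise relates to each concept.
    w = F.softmax(key_matrix @ query_key, dim=0)    # (N,)
    # Read content: weighted combination of the student's mastery values.
    return w @ value_matrix                          # (d_v,)
```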
“…Random forests [174,197,199], k-nearest neighbours [204], Neural networks [177,200,[208][209][210]219], Bayesian networks [197,210,214], Regression models [174,177,178,188,198,205,206,213,216,217], Naïve Bayes [174,214], Rule-based systems [197,207,209,221], Decision trees [174, 183, 196-198, 210, 214], Correlational analysis [220], Support vector machines [198,214], Matrix factorization & collaborative filtering [200-203, 211, 212], Cox proportional hazard model [215], Deep knowledge tracing [31,222,223], Memory-augmented neural networks [224], Transformers [225][226][227]…”
Section: Video Watching Behaviour
confidence: 99%