Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018
DOI: 10.1145/3219819.3220082
Stable Prediction across Unknown Environments

Abstract: In many machine learning applications, the training distribution used to learn a probabilistic classifier differs from the testing distribution on which the classifier will be used to make predictions. Traditional methods correct the distribution shift by reweighting the training data with the ratio of the density between test and training data. But in many applications training takes place without prior knowledge of the testing. Recently, methods have been proposed to address the shift by learning causal stru…
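As context for the reweighting baseline the abstract describes, here is a minimal sketch of density-ratio importance weighting on a 1-D covariate shift. All names and distributions are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1-D covariate shift: training features from N(0, 1),
# test features assumed to come from N(1, 1).
x_train = rng.normal(0.0, 1.0, size=1000)

def gauss_pdf(x, mu, sigma):
    """Density of a Gaussian N(mu, sigma^2) at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Importance weight of each training point: p_test(x) / p_train(x).
w = gauss_pdf(x_train, 1.0, 1.0) / gauss_pdf(x_train, 0.0, 1.0)

# The corrected training objective replaces the plain mean loss with a
# weighted mean; here a stand-in squared loss centered at the test mean.
loss_per_sample = (x_train - 1.0) ** 2
weighted_loss = np.average(loss_per_sample, weights=w)
```

Note this sketch assumes both densities are known; as the abstract points out, in many applications the test distribution is unknown at training time, which is exactly the setting the paper targets.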

Cited by 126 publications (117 citation statements) · References 21 publications
“…A qualified model for student performance prediction should have good results from both regression and classification perspectives. In this paper, we evaluated the prediction performance of all models using four widely-used metrics in the domain [13], [24], [49], [50], [56]. From the regression perspective, we selected Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to quantify the distance between predicted scores and the actual ones.…”
Section: Evaluation Metrics
confidence: 99%
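The two regression metrics named in the excerpt can be computed directly; a minimal sketch with made-up arrays:

```python
import numpy as np

# Illustrative predicted and actual scores (made-up values).
y_true = np.array([3.0, 1.0, 4.0, 1.5])
y_pred = np.array([2.5, 1.0, 3.5, 2.0])

# Mean Absolute Error: average magnitude of the prediction error.
mae = np.mean(np.abs(y_pred - y_true))

# Root Mean Square Error: square root of the average squared error,
# which penalizes large deviations more heavily than MAE.
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
```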
“…The joint distribution of features and outcomes on (X, Y) can change across environments: P^e_XY ≠ P^{e′}_XY for e ≠ e′, e, e′ ∈ E. In this paper, our goal is to learn a predictive model for stable prediction with model misspecification and agnostic distribution shift. To measure its performance on the stable prediction problem, we adopt the Average Error and Stability Error in (Kuang et al. 2018) as:…”
Section: Stable Prediction Problem
confidence: 99%
“…where |E| refers to the number of test environments, and RMSE(D^e) represents the Root Mean Square Error of a predictive model on dataset D^e. Actually, Average Error and Stability Error refer to the mean and variance of predictive error over all possible environments e ∈ E. Then, the stable prediction problem (Kuang et al. 2018) is defined as: Problem 1 (Stable Prediction) Given one training environment e ∈ E with dataset D^e = (X^e, Y^e), the task is to learn a predictive model to predict across unknown environments E with not only small Average Error but also small Stability Error.…”
Section: Stable Prediction Problem
confidence: 99%
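The two quantities described in the excerpt — the mean and the spread of a model's per-environment RMSE — can be sketched as follows. Here Stability Error is taken as the standard deviation across environments; the exact normalization in Kuang et al. 2018 may differ, and the RMSE values are made up:

```python
import numpy as np

# Per-environment RMSE of one model on test environments e ∈ E
# (illustrative numbers, not from the paper).
rmse_per_env = np.array([0.90, 1.10, 0.95, 1.25])

# Average Error: mean predictive error over all test environments.
average_error = rmse_per_env.mean()

# Stability Error: spread of the error across environments (here the
# population standard deviation of the per-environment RMSEs).
stability_error = rmse_per_env.std()
```

A stable predictor keeps both numbers small: low error on average, and little variation when moved to an unknown environment.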
“…A fundamental requirement for out-of-domain transfer learning is to mitigate the biases from the pretraining data [49], which may be useful for the in-domain testing but harmful for out-of-domain testing [19] due to the spurious correlation [34]. To verify such existence of the correlation biases, we follow [49] to conduct a toy experiment on Conceptual Caption dataset.…”
Section: Introduction
confidence: 99%