Background
Prediction models inform many medical decisions, but their performance often deteriorates over time. Several discrete-time update strategies have been proposed in the literature, including model recalibration and revision. However, these strategies have not been compared in the dynamic updating setting.
Methods
We used post-lung transplant survival data from 2010 to 2015 and compared the Brier score (BS), discrimination, and calibration of the following update strategies: (1) never update, (2) update using the closed testing procedure proposed in the literature, (3) always recalibrate the intercept, (4) always recalibrate the intercept and slope, and (5) always refit/revise the model. In each case, we explored update intervals of 1, 2, 4, and 8 quarters. We also examined how the performance of the update strategies changed as the amount of old data included in each update (i.e., the sliding window length) increased.
Results
All methods of updating the model led to meaningful improvement in BS relative to never updating. More frequent updating yielded better BS, discrimination, and calibration, regardless of update strategy. Recalibration strategies led to more consistent improvements and less variability over time compared to the other updating strategies. Using longer sliding windows did not substantially impact the recalibration strategies, but did improve the discrimination and calibration of the closed testing procedure and model revision strategies.
Conclusions
Model updating leads to improved BS, with more frequent updating performing better than less frequent updating. Model recalibration strategies appeared to be the least sensitive to the update interval and sliding window length.
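The recalibration and revision strategies compared above can be sketched in a few lines. The following is a minimal illustration on simulated data (the data-generating coefficients, the population drift, and the `fit_logistic` helper are all hypothetical, not taken from the study): intercept-only recalibration fits a logistic model with the old linear predictor as a fixed offset, intercept-and-slope recalibration regresses the outcome on the linear predictor, and revision re-estimates all coefficients.

```python
import numpy as np

def fit_logistic(X, y, offset=None, iters=30):
    """Fit a logistic model by Newton-Raphson; `offset` is a fixed
    additive term on the linear-predictor (logit) scale."""
    if offset is None:
        offset = np.zeros(len(y))
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta + offset)))
        grad = X.T @ (y - p)                        # score vector
        H = (X * (p * (1 - p))[:, None]).T @ X      # observed information
        beta += np.linalg.solve(H, grad)
    return beta

rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=(n, 2))
beta_old = np.array([-1.0, 0.8, -0.5])        # hypothetical original model
lp = beta_old[0] + x @ beta_old[1:]           # old model's linear predictor
# New population: the baseline risk has drifted (intercept shifted by +0.7)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(lp + 0.7))))

ones = np.ones((n, 1))
# Strategy 3: recalibrate the intercept only (old linear predictor as offset)
a = fit_logistic(ones, y, offset=lp)
# Strategy 4: recalibrate intercept and slope (regress outcome on lp)
ab = fit_logistic(np.column_stack([ones, lp]), y)
# Strategy 5: refit/revise the model (re-estimate every coefficient)
beta_new = fit_logistic(np.column_stack([ones, x]), y)
```

With a pure intercept drift, strategy 3 should recover the shift (about 0.7 here), strategy 4 should estimate a calibration slope near 1, and strategy 5 should recover the new coefficients outright; the strategies differ in how many parameters they must re-estimate from the update window.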
The Brier score has been a popular measure of prediction accuracy for binary outcomes. However, it is not straightforward to interpret the Brier score for a prediction model, since its value depends on the outcome prevalence. We decompose the Brier score into two components: the mean squared difference between the estimated and true underlying event probabilities, and the variance of the binary outcome, which is not reflective of model performance. We then propose to modify the Brier score by removing the variance of the binary outcome, estimated via a general sliding window approach. Through simulation, we show that the proposed measure is more sensitive for comparing different models. A standardized performance improvement measure is also proposed based on the new criterion to quantify the improvement in prediction performance. We apply the new measures to data from the Breast Cancer Surveillance Consortium and compare the performance of breast cancer risk prediction models with and without the most important predictor.
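The decomposition described above can be checked numerically. In a simulation where the true event probabilities are known (all quantities below are simulated for illustration, and the estimation of the variance term via a sliding window is not reproduced here), the Brier score splits, in expectation, into a model-error term plus an outcome-variance term, and subtracting the variance term leaves a quantity that reflects only model performance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
p_true = rng.uniform(0.05, 0.6, size=n)    # true event probabilities
y = rng.binomial(1, p_true)                # observed binary outcomes
# Model estimates: true probabilities plus estimation noise
p_hat = np.clip(p_true + rng.normal(0.0, 0.05, size=n), 0.0, 1.0)

bs = np.mean((p_hat - y) ** 2)             # standard Brier score
mse = np.mean((p_hat - p_true) ** 2)       # model-error component
noise = np.mean(p_true * (1 - p_true))     # outcome variance component

# In expectation, bs ≈ mse + noise; removing the outcome variance
# yields an adjusted score driven only by model error.
bs_adj = bs - noise
```

The `noise` term depends on the outcome prevalence but not on the model, which is why two models can have similar raw Brier scores on a low-prevalence outcome while differing clearly on the adjusted measure.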
Prediction modeling for clinical decision making is of great importance, and models need to be updated frequently as the patient population and clinical practice change. Existing methods either are applied in an ad hoc fashion, such as model recalibration, or focus on studying the relationship between predictors and outcome rather than on prediction. In this article, we propose a dynamic logistic state space model to continuously update the parameters whenever new information becomes available. The proposed model allows for both time-varying and time-invariant coefficients. The varying coefficients are modeled using smoothing splines to account for their smooth trends over time, with the smoothing parameters chosen objectively by maximum likelihood. The model is updated using batch data accumulated at prespecified time intervals, which allows for better approximation of the underlying binomial density function. In simulation, we show that the new model has significantly higher prediction accuracy than existing methods. We apply the method to predict 1-year survival after lung transplantation using United Network for Organ Sharing data.
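The full state space model with smoothing splines is beyond a short sketch, but the basic batch-update idea can be illustrated with a deliberately simplified stand-in: a logistic model refit on each new batch, warm-started from the previous parameters so that the estimates track a drifting process. All simulation settings below (batch size, drift, coefficients) are hypothetical, and this is not the authors' proposed estimator:

```python
import numpy as np

def newton_step(X, y, beta):
    """One Newton-Raphson step for logistic regression."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    H = (X * (p * (1 - p))[:, None]).T @ X
    return beta + np.linalg.solve(H, X.T @ (y - p))

rng = np.random.default_rng(2)
beta = np.zeros(3)  # [intercept, x1, x2], carried across batches

# Quarterly batches with a baseline risk that drifts upward over time
for t in range(8):
    X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
    lp = (-1.0 + 0.1 * t) + X[:, 1:] @ np.array([0.8, -0.5])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lp)))
    # Warm-started refit on the new batch: the previous beta is the
    # starting point, so each update tracks the drifting intercept
    for _ in range(8):
        beta = newton_step(X, y, beta)
```

In this simplification only the latest batch informs each update; the state space formulation instead carries forward uncertainty about the coefficients and smooths their trajectories over time.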