“…Tran et al. [21] prove their result using a canonical hierarchical normal model with conjugate priors, and in fact the objective function in (8) arises from the posterior predictive distribution in the model. Further details on the model used in [21] are given in Appendix A.1. Another way to interpret their algorithm is as follows:
1. Using a training dataset, fit a linear regression of Y on T and X.
2. Based on the model fitted in step 1, compute predicted/fitted values of Y using the test dataset.
3. Solve equation (8) using the test data.
Again, there is an implicit assumption that the training and test datasets come from the same distribution.…”
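To make the first two steps of this reinterpretation concrete, here is a minimal Python sketch under stated assumptions. The helper names `fit_linear_regression` and `predict`, the synthetic data, the binary coding of T, and the inclusion of an intercept are all illustrative choices that do not come from [21]. Equation (8) is not reproduced in this excerpt, so the sketch stops after step 2; the resulting test-set predictions are what would then enter (8) in step 3.

```python
import numpy as np


def fit_linear_regression(T_train, X_train, Y_train):
    """Step 1: ordinary least squares of Y on an intercept, T and X."""
    design = np.column_stack([np.ones(len(Y_train)), T_train, X_train])
    coef, *_ = np.linalg.lstsq(design, Y_train, rcond=None)
    return coef


def predict(coef, T_test, X_test):
    """Step 2: predicted values of Y on the test data from the step-1 fit."""
    design = np.column_stack([np.ones(len(T_test)), T_test, X_test])
    return design @ coef


# Synthetic data, made up purely for illustration (not from the paper).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
T = rng.integers(0, 2, size=n).astype(float)  # assumed binary here
Y = 1.0 + 2.0 * T + X @ np.array([0.5, -1.0]) + rng.normal(scale=0.1, size=n)

# Training and test halves drawn from the same distribution,
# matching the implicit assumption noted in the text.
train, test = slice(0, 100), slice(100, 200)

coef = fit_linear_regression(T[train], X[train], Y[train])
Y_hat_test = predict(coef, T[test], X[test])
```

Because both halves are drawn from the same generating process, the fitted coefficients carry over to the test set; if the two datasets differed in distribution, the predictions passed on to step 3 could be systematically off.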