“…Similarly, the recent advances in using and understanding the Nyström method (Williams and Seeger, 2001), which is one of the most popular sparse approximations in kernel methods, have been made independently of those in sparse GP approximations. The majority of these advances focus on efficient approximation of the kernel matrix (e.g., Drineas and Mahoney, 2005; Belabbas and Wolfe, 2009; Gittens and Mahoney, 2016; Derezinski et al, 2020) or on empirical risk minimization in the RKHS with a reduced basis (e.g., Bach, 2013; El Alaoui and Mahoney, 2015; Rudi et al, 2015, 2017; Meanti et al, 2020). This separation of the two lines of research is arguably due to the differences in the notations and modeling philosophies of GPs and kernel methods.…”