“…Standard RandNLA guarantees such as the subspace embedding are sufficient (although not necessary) to ensure that Ĥ_t provides a good enough approximation to enable accelerated local convergence in time Õ(nd). These approaches have also been extended to distributed settings via RMT-based model averaging, with applications in ensemble methods, distributed optimization, and federated learning [110,109,144,48,92]. Further RandNLA-based Newton-type methods include: Subsampled Newton [75,159,18,17]; Hessian approximations via randomized Taylor expansion [1] and low-rank approximation [77,55]; Hessian diagonal/trace estimates via Hutchinson's method [136] and Stochastic Lanczos Quadrature, particularly for non-convex problems, e.g., PyHessian [176] and AdaHessian [177]; and Stochastic Quasi-Newton-type methods [106,137].…”
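To make one of the techniques above concrete, the following is a minimal sketch (not taken from any of the cited works) of Hutchinson's estimator for the Hessian trace and diagonal, assuming access only to Hessian-vector products. The matrix `H` here is a synthetic stand-in for a Hessian; in practice the `hvp` oracle would come from automatic differentiation, as in tools like PyHessian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic symmetric "Hessian" accessed only through matrix-vector products.
d = 50
A = rng.standard_normal((d, d))
H = A @ A.T  # positive semi-definite stand-in

def hvp(v):
    """Hessian-vector product oracle (here simply H @ v)."""
    return H @ v

def hutchinson_trace(hvp, d, num_samples=200, rng=rng):
    """Estimate trace(H) as the average of z^T H z over Rademacher probes z."""
    est = 0.0
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=d)
        est += z @ hvp(z)
    return est / num_samples

def hutchinson_diag(hvp, d, num_samples=200, rng=rng):
    """Estimate diag(H) as the entrywise average of z * (H z)."""
    est = np.zeros(d)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=d)
        est += z * hvp(z)
    return est / num_samples

trace_est = hutchinson_trace(hvp, d)
diag_est = hutchinson_diag(hvp, d)
trace_true = np.trace(H)
```

Both estimators are unbiased, and their variance decays as 1/num_samples; each probe costs one Hessian-vector product, which is why such estimates are attractive for large non-convex problems where forming the Hessian explicitly is infeasible.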