As a mini-batch version of the SARAH algorithm, the MB-SARAH algorithm has received extensive attention due to its simple recursive scheme for updating stochastic gradient estimates. In this paper, we propose a modification of the MB-SARAH method that incorporates the Barzilai-Borwein (BB) step size, abbreviated as MB-SARAH-BB. MB-SARAH-BB combines advantages of both the MB-SARAH and BB methods, providing robustness to the choice of initial step size during the optimization process. Within the MB-SARAH-BB framework, we further propose a novel, easily implementable method, Ada-MB-SARAH-BB, which uses adaptive sampling probabilities for the mini-batch stochastic recursive gradient computation in the inner loop. We establish the linear convergence of the MB-SARAH-BB and Ada-MB-SARAH-BB methods under mild assumptions. Numerical experiments on standard machine learning datasets demonstrate that MB-SARAH-BB is effective and competitive with recent successful stochastic gradient methods. The experiments also show that the performance of Ada-MB-SARAH-BB is generally better than, or comparable to, that of MB-SARAH-BB.
MSC Classification: 90C15 · 90C25 · 90C30
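To make the two ingredients named above concrete, the following is a minimal, hedged sketch (not the paper's exact algorithm or notation) of a mini-batch SARAH-style recursive gradient estimate combined with a Barzilai-Borwein step size, applied to a toy ridge-regression problem. All names (`n_outer`, `n_inner`, `batch`, the BB scaling by the inner-loop length) are illustrative assumptions.

```python
import numpy as np

# Toy ridge-regression problem: min_w (1/2n)||Aw - b||^2 + (lam/2)||w||^2
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)
lam = 0.1

def full_grad(w):
    return A.T @ (A @ w - b) / n + lam * w

def batch_grad(w, idx):
    Ai = A[idx]
    return Ai.T @ (Ai @ w - b[idx]) / len(idx) + lam * w

w = np.zeros(d)
eta = 0.05                       # initial step size, used only for the first outer loop
n_outer, n_inner, batch = 30, 20, 8
w_prev_outer = g_prev_outer = None

for s in range(n_outer):
    g = full_grad(w)             # full gradient at the outer iterate
    # BB step size from successive outer iterates (skipped at s = 0);
    # dividing by n_inner is one common scaling choice, assumed here.
    if w_prev_outer is not None:
        sk, yk = w - w_prev_outer, g - g_prev_outer
        denom = abs(sk @ yk)
        if denom > 1e-12:
            eta = (sk @ sk) / denom / n_inner
    w_prev_outer, g_prev_outer = w.copy(), g.copy()

    # SARAH recursion: v_t = grad_B(w_t) - grad_B(w_{t-1}) + v_{t-1}
    v, w_prev = g, w.copy()
    w = w - eta * v
    for t in range(n_inner):
        idx = rng.choice(n, size=batch, replace=False)
        v = batch_grad(w, idx) - batch_grad(w_prev, idx) + v
        w_prev = w.copy()
        w = w - eta * v

print(float(np.linalg.norm(full_grad(w))))
```

Note the design choice: the BB formula reuses the full gradients already computed at the start of each outer loop, so the step size adapts across outer iterations at no extra gradient cost.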