Traditionally, stochastic approximation (SA) schemes have been popular choices for solving stochastic optimization problems. However, the performance of standard SA implementations can vary significantly based on the choice of the steplength sequence, and in general, little guidance is provided about good choices. Motivated by this gap, in the first part of the paper, we present two adaptive steplength schemes for strongly convex differentiable stochastic optimization problems, equipped with convergence theory, that aim to overcome some of the reliance on user-specific parameters. Of these, the first scheme, referred to as a recursive steplength stochastic approximation (RSA) scheme, optimizes the error bounds to derive a rule that expresses the steplength at a given iteration as a simple function of the steplength at the previous iteration and certain problem parameters. The second scheme, termed a cascading steplength stochastic approximation (CSA) scheme, maintains the steplength sequence as a piecewise-constant decreasing function, with the reduction in the steplength occurring when a suitable error threshold is met.

In the second part of the paper, we allow for nondifferentiable objectives but with bounded subgradients over a certain domain. In such a regime, we propose a local smoothing technique, based on random local perturbations of the objective function, that leads to a differentiable approximation of the function. Assuming a uniform distribution on the local randomness, we establish a Lipschitzian property for the gradient of the approximation and prove that the obtained Lipschitz bound grows at a modest rate with problem size. This facilitates the development of an adaptive steplength stochastic approximation framework, which now requires sampling in the product space of the original measure and the artificially introduced distribution. The resulting adaptive steplength schemes are applied to three stochastic optimization problems.
In particular, we observe that both schemes perform well in practice and display markedly less reliance on user-defined parameters.
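To make the flavor of a recursive steplength scheme concrete, the sketch below runs stochastic gradient iterations on a simple strongly convex problem while updating the steplength as a function of its previous value and the strong convexity parameter. The specific recursion `gamma <- gamma * (1 - eta * gamma)` and the test problem are illustrative assumptions, not the rule derived in the paper, which is obtained by optimizing the error bounds.

```python
import numpy as np

# Illustrative sketch (not the paper's derived rule): minimize
# f(x) = E[(x - xi)^2 / 2] with xi ~ N(1, 0.1), so f is strongly
# convex with parameter eta = 1 and minimizer x* = 1.
rng = np.random.default_rng(0)

eta = 1.0      # strong convexity parameter of f
gamma = 0.5    # initial steplength; chosen so that eta * gamma < 1
x = 5.0        # starting iterate

for k in range(2000):
    xi = rng.normal(1.0, 0.1)
    grad_sample = x - xi        # noisy gradient of f at x
    x -= gamma * grad_sample
    # Hypothetical recursive update: the next steplength depends only on
    # the current steplength and the problem parameter eta; it decays
    # roughly like 1/k, so the usual SA summability conditions hold.
    gamma = gamma * (1.0 - eta * gamma)

print(x)
```

With this recursion the steplength sequence is decreasing, non-summable, and square-summable, which is the standard requirement for almost-sure convergence of SA iterates; the iterate settles near the minimizer x* = 1.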
I. INTRODUCTION

The use of stochastic gradient and subgradient schemes for the solution of stochastic convex optimization problems has a long tradition, beginning with an iterative scheme, first proposed by Robbins and Monro [1], that relied primarily on noisy gradient observations. Research by Ermoliev and his coauthors [2]-[5] focused largely on quasigradient (subgradient) methods and considered a host of stochastic programming problems, amongst them two-period recourse-based problems (see [6]). To accelerate the convergence of stochastic subgradient methods, ergodic sequences, arising from the averaging of iterates, have been employed in [7]-[10]. Often gradient computations are either costly or unavailable; in such instances, a finite-difference approximation of the gradient can be constructed, as first observed by Kiefer and Wolfowitz [11]. While standard finite-difference techniques perturb one direction at a time to obtain gradient...