We propose a novel technique for improving the stochastic gradient descent (SGD) method to train deep networks, which we term pbSGD. The proposed pbSGD method simply raises the stochastic gradient to a certain power elementwise during iterations and introduces only one additional parameter, namely, the power exponent (when it equals 1, pbSGD reduces to SGD). We further propose pbSGD with momentum, which we term pbSGDM. The main results of this paper are comprehensive experiments on popular deep learning models and benchmark datasets. Empirical results show that the proposed pbSGD and pbSGDM obtain faster initial training speed than adaptive gradient methods, generalization ability comparable to that of SGD, and improved robustness to hyper-parameter selection and vanishing gradients. pbSGD is essentially a gradient modifier via a nonlinear transformation. As such, it is orthogonal and complementary to other techniques for accelerating gradient-based optimization, such as learning rate schedules. Finally, we provide a convergence rate analysis for both the pbSGD and pbSGDM methods. The resulting rates match the best known theoretical convergence rates for the SGD and SGDM methods on nonconvex functions.
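The update described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the elementwise power is taken to be sign-preserving, sign(g)·|g|^gamma, so that fractional exponents stay defined for negative gradient entries; the momentum variant is written as a plain heavy-ball update; and the names `powerball`, `pbsgd_step`, `pbsgdm_step`, `lr`, `gamma`, and `beta` are hypothetical.

```python
import math


def powerball(grad, gamma):
    """Elementwise sign-preserving power: sign(g) * |g|**gamma (assumed form)."""
    return [math.copysign(abs(g) ** gamma, g) for g in grad]


def pbsgd_step(params, grad, lr, gamma):
    """One pbSGD-style update; gamma = 1 leaves the gradient unchanged (plain SGD)."""
    transformed = powerball(grad, gamma)
    return [p - lr * t for p, t in zip(params, transformed)]


def pbsgdm_step(params, velocity, grad, lr, gamma, beta=0.9):
    """One pbSGDM-style update, assuming a heavy-ball momentum buffer."""
    transformed = powerball(grad, gamma)
    new_velocity = [beta * v + t for v, t in zip(velocity, transformed)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Example: with gamma = 0.5 the gradient [4.0, -9.0] is transformed to [2.0, -3.0].
params = pbsgd_step([1.0, -2.0], [4.0, -9.0], lr=0.1, gamma=0.5)
```

With an exponent below 1, entries whose magnitude is below 1 are enlarged and entries above 1 are shrunk, which is one way to read the claimed robustness to vanishing gradients.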
The probabilistic characteristics of daily wind speed are not well captured by simple density functions such as Normal or Weibull distributions as suggested by the existing literature. The unmodeled uncertainties can cause unknown influences on the power system operation. In this paper, we develop a new stochastic scheme for the probabilistic optimal power flow (POPF) problem, which can cope with arbitrarily complex wind speed distributions and also take into account the correlation of different wind farms. A multivariate Gaussian mixture model (GMM) is employed to approximate actual wind speed distributions from multiple wind farms. Furthermore, we propose to adopt the Markov chain Monte Carlo (MCMC) sampling technique to deliver wind speed samples as the input of POPF. As a further novelty, we integrate a Sobol-based quasi-Monte Carlo (QMC) technique into the MCMC sampling process to obtain a faster convergence rate. The IEEE 14- and 118-bus benchmark systems with additional wind farms are used to examine the effectiveness of the proposed POPF scheme.
Index Terms: Probabilistic optimal power flow, Gaussian mixture model, Markov chain Monte Carlo, quasi-Monte Carlo, uncertainty.
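To make the sampling idea concrete, the following is a minimal sketch of drawing wind speed samples from a Gaussian mixture with a random-walk Metropolis sampler. It is a simplification under stated assumptions, not the scheme of the paper: the mixture is one-dimensional rather than multivariate, the weights, means, and standard deviations are made-up illustrative values, the proposal uses ordinary pseudo-random numbers rather than a Sobol sequence, and the names `gmm_pdf` and `metropolis_samples` are hypothetical.

```python
import math
import random


def gmm_pdf(x, weights, means, stds):
    """Density of a one-dimensional Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )


def metropolis_samples(pdf, n_samples, x0, step, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler targeting the density pdf."""
    rng = random.Random(seed)
    x, px = x0, pdf(x0)
    samples = []
    for i in range(burn_in + n_samples):
        candidate = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        p_candidate = pdf(candidate)
        # Accept with probability min(1, p_candidate / px).
        if rng.random() * px < p_candidate:
            x, px = candidate, p_candidate
        if i >= burn_in:
            samples.append(x)
    return samples


# Illustrative (made-up) two-component mixture of wind speeds in m/s.
weights, means, stds = [0.6, 0.4], [4.0, 10.0], [1.5, 2.0]
samples = metropolis_samples(
    lambda x: gmm_pdf(x, weights, means, stds), n_samples=20000, x0=5.0, step=3.0
)
sample_mean = sum(samples) / len(samples)  # mixture mean is 0.6*4 + 0.4*10 = 6.4
```

In a POPF study, each retained sample would be passed to a power flow solver as one wind speed scenario; that step is outside this sketch.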