2021
DOI: 10.1137/19m1263443

Convergence and Dynamical Behavior of the ADAM Algorithm for Nonconvex Stochastic Optimization

Abstract: Adam is a popular variant of stochastic gradient descent for finding a local minimizer of a function. The objective function is unknown, but a random estimate of the current gradient vector is observed at each round of the algorithm. Assuming that the objective function is differentiable and non-convex, we establish the convergence in the long run of the iterates to a stationary point. The key ingredient is the introduction of a continuous-time version of Adam, in the form of a non-autonomous ordinary differential equation […]
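For context, the discrete recursion that this continuous-time analysis refers to is the standard Adam update of Kingma and Ba; the restatement below uses the usual notation (step size α, momentum parameters β₁, β₂, stabilizer ε, noisy gradient g) and is not quoted from the abstract.

```latex
% Standard Adam recursion (Kingma & Ba), restated for context; requires amsmath.
\[
\begin{aligned}
m_{n+1} &= \beta_1 m_n + (1-\beta_1)\, g_{n+1}, \\
v_{n+1} &= \beta_2 v_n + (1-\beta_2)\, g_{n+1}^{\odot 2}, \\
\hat m_{n+1} &= \frac{m_{n+1}}{1-\beta_1^{\,n+1}}, \qquad
\hat v_{n+1} = \frac{v_{n+1}}{1-\beta_2^{\,n+1}}, \\
x_{n+1} &= x_n - \alpha\, \frac{\hat m_{n+1}}{\varepsilon + \sqrt{\hat v_{n+1}}}.
\end{aligned}
\]
```

The paper's non-autonomous ODE is presented as a continuous-time counterpart of this recursion.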

Cited by 52 publications (45 citation statements)
References 20 publications (26 reference statements)
“…In the experiment, all weights are initialized randomly from an N(0, 1) Gaussian distribution. Given the graphics-card memory constraints, batch_size is set to 1. The initial learning rate is set to 0.001, using the Adam optimization algorithm [30]. The Adam optimization algorithm computes first- and second-order moment estimates of the gradient to adapt the learning rate.…”
Section: Experimental Setup and Environment
mentioning, confidence: 99%
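To make the quoted setup concrete, here is a minimal sketch assuming PyTorch; the model, data, and loss below are placeholder assumptions for illustration, not details from the citing paper.

```python
# Minimal sketch of the quoted setup, assuming PyTorch; model and data are placeholders.
import torch
import torch.nn as nn

model = nn.Linear(128, 10)           # hypothetical model standing in for the cited network
for p in model.parameters():         # weights drawn randomly from N(0, 1)
    nn.init.normal_(p, mean=0.0, std=1.0)

# Adam adapts the step size from first- and second-moment estimates of the gradient.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(32, 128), torch.randint(0, 10, (32,))),
    batch_size=1,                    # batch_size = 1, as dictated by memory in the quote
)

loss_fn = nn.CrossEntropyLoss()      # illustrative loss, not specified in the quote
for x, y in loader:
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
```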
“…The idea of the adaptive step size is taken from the Adagrad algorithm introduced in [26]. Analysis of such algorithms for nonconvex objectives was proposed in [36,55,3,25] in the stochastic and smooth setting. To our knowledge, the combination of adaptive step sizes with incremental methods has not been considered.…”
Section: Relation to Existing Literature
mentioning, confidence: 99%
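As a reminder of the Adagrad-style adaptive step size mentioned in the quote, here is a minimal NumPy sketch; the toy objective and constants are illustrative assumptions, not taken from [26] or the citing paper.

```python
# Adagrad-style adaptive step size on a toy objective f(x) = 0.5 * ||x||^2 (illustrative).
import numpy as np

def grad(x):                    # gradient of the toy objective
    return x

x = np.array([1.0, -2.0])
accum = np.zeros_like(x)        # running sum of squared gradients
alpha, eps = 0.5, 1e-8

for _ in range(100):
    g = grad(x)
    accum += g ** 2                               # coordinate-wise accumulation
    x -= alpha * g / (np.sqrt(accum) + eps)       # per-coordinate adaptive step
```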
“…Our nonsmooth convergence analysis relies on the ODE method, see [37] and many subsequent developments [6,33,7,17,3]. In particular, we build upon a nonsmooth ODE formulation via differential inclusions [22,2].…”
Section: Relation to Existing Literature
mentioning, confidence: 99%
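For readers unfamiliar with the ODE method invoked here, the schematic below states the generic idea: a stochastic recursion whose interpolated iterates asymptotically track a differential inclusion when the drift is set-valued (e.g. a Clarke subdifferential). The notation is generic and not taken from the quoted references.

```latex
% Generic ODE-method / differential-inclusion viewpoint (illustrative notation only).
\[
x_{k+1} = x_k - \gamma_{k+1}\bigl(y_{k+1} + \xi_{k+1}\bigr),
\quad y_{k+1} \in \partial F(x_k),
\qquad\text{with interpolated iterates tracking}\qquad
\dot x(t) \in -\,\partial F(x(t)).
\]
```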
“…Our starting point is a generic non-autonomous Ordinary Differential Equation (ODE) introduced by Belotto da Silva and Gazeau [9] (see also [8] for Adam), depicting the continuous-time versions of the aforementioned florilegium of algorithms. The solutions to the ODE are shown to converge to the set of critical points of F .…”
Section: Introduction
mentioning, confidence: 99%
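A simplified schematic of such a non-autonomous continuous-time system for Adam might read as follows; it omits the time-dependent bias-correction factors, so it is only an illustration and not the exact ODE of [8] or [9].

```latex
% Schematic continuous-time Adam dynamics; the exact non-autonomous systems in [8]/[9]
% carry additional time-dependent bias-correction factors omitted here.
\[
\begin{aligned}
\dot m(t) &= a\,\bigl(\nabla F(x(t)) - m(t)\bigr), \\
\dot v(t) &= b\,\bigl(\nabla F(x(t))^{\odot 2} - v(t)\bigr), \\
\dot x(t) &= -\,\frac{m(t)}{\varepsilon + \sqrt{v(t)}}.
\end{aligned}
\]
```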