Nonconvex and nonsmooth optimization problems are frequently encountered in much of statistics, business, science and engineering, but they are not yet widely recognized as a technology in the sense of scalability. A reason for this relatively low degree of popularity is the lack of a well-developed system of theory and algorithms to support the applications, as is the case for its convex counterpart. This paper aims to take one step in the direction of disciplined nonconvex and nonsmooth optimization. In particular, we consider in this paper some constrained nonconvex optimization models in block decision variables, with or without coupled affine constraints. In the absence of coupled constraints, we show a sublinear rate of convergence to an $\epsilon$-stationary solution in the form of variational inequality for a generalized conditional gradient method, where the convergence rate is dependent on the Hölderian continuity of the gradient of the smooth part of the objective. For the model with coupled affine constraints, we introduce corresponding $\epsilon$-stationarity conditions, and apply two proximal-type variants of the ADMM to solve such a model, assuming the proximal ADMM updates can be implemented for all the block variables except for the last block, for which either a gradient step or a majorization-minimization step is implemented. We show an iteration complexity bound of $O(1/\epsilon^2)$ to reach an $\epsilon$-stationary solution for both algorithms. Moreover, we show that the same iteration complexity of a proximal BCD method follows immediately. Numerical results are provided to illustrate the efficacy of the proposed algorithms for tensor robust PCA.
The alternating direction method of multipliers (ADMM) has been widely used for solving structured convex optimization problems. In particular, the ADMM can solve convex programs that minimize the sum of N convex functions with N-block variables linked by some linear constraints. While the convergence of the ADMM for N = 2 was well established in the literature, it remained an open problem for a long time whether or not the ADMM for N ≥ 3 is still convergent. Recently, it was shown in [5] that without further conditions the ADMM for N ≥ 3 may actually fail to converge. In this paper, we show that under some easily verifiable and reasonable conditions the global linear convergence of the ADMM when N ≥ 3 can still be assured, which is important since the ADMM is a popular method for solving large scale multi-block optimization models and is known to perform very well in practice even when N ≥ 3. Our study aims to offer an explanation for this phenomenon.
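As a point of reference for the well-understood N = 2 case mentioned above, the following is a minimal sketch of scaled-form ADMM on a toy problem. The problem (two scalar quadratics coupled by x - z = 0), the function name `admm_two_block`, and the parameter choices are illustrative assumptions, not taken from the paper, and the sketch does not cover the multi-block (N ≥ 3) setting the abstract analyzes.

```python
def admm_two_block(a, b, rho=1.0, iters=100):
    """Scaled-form ADMM for the toy problem
         minimize 0.5*(x - a)**2 + 0.5*(z - b)**2  subject to  x - z = 0.
    The toy problem and all names here are assumptions for illustration.
    """
    x = z = u = 0.0  # u is the scaled dual variable
    for _ in range(iters):
        # x-update: argmin_x 0.5*(x - a)^2 + (rho/2)*(x - z + u)^2
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: argmin_z 0.5*(z - b)^2 + (rho/2)*(x - z + u)^2
        z = (b + rho * (x + u)) / (1.0 + rho)
        # dual update on the constraint residual x - z
        u = u + x - z
    return x, z, u


x, z, u = admm_two_block(a=1.0, b=3.0)
print(x, z)  # both should approach the consensus value (a + b) / 2
```

Both blocks have closed-form updates here because the objective terms are quadratic; in general each update is a subproblem that needs its own solver.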
A model for brittle fracture by transgranular cleavage cracking is presented based on the application of weakest link statistics to the critical microstructural fracture mechanisms. The model permits prediction of the macroscopic fracture toughness, $K_{Ic}$, in single phase microstructures containing a known distribution of particles, and defines the critical distance from the crack tip at which the initial cracking event is most probable. The model is developed for unstable fracture ahead of a sharp crack considering both linear elastic and nonlinear elastic ("elastic/plastic") crack tip stress fields. Predictions are evaluated by comparison with experimental results on the low temperature flow and fracture behavior of a low carbon mild steel with a simple ferrite/grain boundary carbide microstructure. This report was done with support from the Department of Energy. Any conclusions or opinions expressed in this report represent solely those of the author(s) and not necessarily those of The Regents of the University of California, the Lawrence Berkeley Laboratory or the Department of Energy. Reference to a company or product name does not imply approval or recommendation of the product by the University of California or the U.S. Department of Energy to the exclusion of others that may be suitable.
Given an undirected graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$ of agents $\mathcal{N} = \{1, \ldots, N\}$ connected by edges in $\mathcal{E}$, we study how to compute an optimal decision on which there is consensus among agents and that minimizes the sum of agent-specific private convex composite functions $\{\Phi_i\}_{i \in \mathcal{N}}$ while respecting privacy requirements, where $\Phi_i := \xi_i + f_i$ belongs to agent $i$. Assuming only agents connected by an edge can communicate, we propose a distributed proximal gradient method, DPGA, for consensus optimization over both unweighted and weighted static (undirected) communication networks. In one iteration, each agent $i$ computes the prox map of $\xi_i$ and the gradient of $f_i$, and this is followed by local communication with neighboring agents. We also study its stochastic gradient variant, SDPGA, which can only access noisy estimates of $\nabla f_i$ at each agent $i$. This computational model abstracts a number of applications in distributed sensing, machine learning and statistical inference. We show ergodic convergence in both sub-optimality error and consensus violation for DPGA and SDPGA with rates $O(1/t)$ and $O(1/\sqrt{t})$, respectively.
This computational setting, i.e., decentralized consensus optimization, appears as a generic model for various applications in signal processing, e.g., [2]-[6], machine learning, e.g., [7]-[9], and statistical inference, e.g., [10], [11]. Clearly, (3) can also be solved in a "centralized" fashion by communicating all the private functions $\Phi_i$ to a central node, and solving the overall problem at this node. However, such an approach can be very expensive both from communication
January 3, 2017 DRAFT
solution $x = [x_i]_{i \in \mathcal{N}}$ such that its consensus violation satisfies $\max\{\|x_i - x_j\|_2 : (i, j) \in \mathcal{E}\} \le \epsilon$ within $O(1)$ iterations, and its suboptimality is bounded from above as $\sum_{i \in \mathcal{N}} \Phi_i(x_i) - F^* \le \epsilon$ within $O(1/\epsilon^2)$ iterations; however, since the step size is constant, neither suboptimality nor consensus errors are guaranteed to decrease further.
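The per-agent computation described above (a prox map of the nonsmooth part followed by a gradient of the smooth part) can be illustrated with a single-agent proximal gradient step. This is only a sketch under stated assumptions: the choices of an l1 penalty for the nonsmooth term and 0.5*||x - c||^2 for the smooth term, and the names `soft_threshold` and `prox_grad_step`, are hypothetical. It omits the communication and consensus mechanism entirely, so it is not DPGA itself.

```python
def soft_threshold(v, t):
    """Prox map of t * ||.||_1, applied elementwise to the list v."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def prox_grad_step(x, grad_f, alpha, lam):
    """One proximal gradient step: x+ = prox_{alpha*lam*||.||_1}(x - alpha * grad_f(x))."""
    g = grad_f(x)
    return soft_threshold([xi - alpha * gi for xi, gi in zip(x, g)], alpha * lam)


# Assumed smooth part: f(x) = 0.5 * ||x - c||^2, so grad f(x) = x - c.
c = [3.0, -0.5, 1.0]
grad_f = lambda x: [xi - ci for xi, ci in zip(x, c)]

x = [0.0, 0.0, 0.0]
for _ in range(100):
    x = prox_grad_step(x, grad_f, alpha=0.5, lam=1.0)
print(x)  # for this f, the minimizer is soft_threshold(c, lam)
```

For this particular smooth term the minimizer has a closed form, which makes it easy to check that the iteration settles at the right point.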
Although these algorithms are for more general problems and assume mere convexity on each $\Phi_i$, this generality comes at the cost of $O(1/\epsilon^2)$ complexity bounds, and they also tend to be very slow in practice. At the other extreme, under much stronger conditions (each $\Phi_i$ is smooth and has bounded gradients), Jakovetic et al. [19] developed a fast distributed gradient method, D-NC, with an $O(\log(1/\epsilon)/\sqrt{\epsilon})$ convergence rate in communication rounds. For the quadratic loss, which is one of the most commonly used loss functions, the bounded gradient assumption does not hold. In terms of distributed applicability, D-NC requires all the nodes in $\mathcal{N}$ to agree on a doubly stochastic weight matrix $W \in \mathbb{R}^{|\mathcal{N}| \times |\mathcal{N}|}$; it also assumes that the second largest eigenvalue of $W$ is known globally among all the nodes, which is not attainable for very large scale fully distributed networks. D-NC is a two-loop algorithm: for each outer loop $k$, each node computes its gradient once, followed by $O(\log(k))$ communication rounds. In the rest, we briefly discuss those algorithms that balance the trade-off between the iterati...
DC magnetron sputtered Ni-Mn and Ni-Mn-Cr films are demonstrated to exhibit strong and thermally stable antiferromagnetism, as well as high corrosion resistance. For a 25.2 nm thick 53.3 Ni-46.7 Mn (in atomic percent) film deposited on top of a 28.5 nm thick 81 Ni-19 Fe film, a unidirectional anisotropy field ($H_{UA}$) of 120.6 Oe is obtained at room temperature after annealing in vacuum. The equivalent interfacial exchange coupling energy ($J_K$) is 0.27 erg/cm$^2$, three times higher than that of bilayer Ni-Fe/50Fe-50Mn films. This strong exchange coupling appears correlated with the presence of an antiferromagnetic θ (NiMn) phase with a CuAu-I-type ordered face-centered-tetragonal structure. The blocking temperature, at which the exchange coupling disappears, is higher than 400 °C. The Cr addition to the Ni-Mn film dilutes the exchange coupling, but $J_K$ for Cr contents ≤10.7 at. % is still higher than that of the Ni-Fe/Fe-Mn films. Both Ni-Mn and Ni-Mn-Cr films exhibit corrosion behaviors much better than the Fe-Mn film and comparable to the Ni-Fe film. The films are proposed as longitudinal bias layers for the stabilization of magnetoresistive read sensors.
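The quoted exchange coupling energy can be sanity-checked against the quoted anisotropy field with the commonly used relation J_K = H_UA · M_s · t, where t is the ferromagnetic layer thickness. The abstract does not state how J_K was computed, and the saturation magnetization M_s ≈ 800 emu/cm³ for the 81 Ni-19 Fe layer is an assumed typical value, not a number from the source. The function name below is hypothetical.

```python
# Values quoted in the abstract
H_UA_OE = 120.6        # unidirectional anisotropy field, Oe
T_NIFE_CM = 28.5e-7    # Ni-Fe layer thickness: 28.5 nm expressed in cm

# ASSUMPTION: typical saturation magnetization of 81 Ni-19 Fe (not given in the source)
M_S_EMU_CM3 = 800.0


def exchange_coupling_energy(h_ua_oe, m_s_emu_cm3, t_cm):
    """Interfacial exchange energy J_K = H_UA * M_s * t, in erg/cm^2 (CGS units)."""
    return h_ua_oe * m_s_emu_cm3 * t_cm


j_k = exchange_coupling_energy(H_UA_OE, M_S_EMU_CM3, T_NIFE_CM)
print(f"J_K = {j_k:.3f} erg/cm^2")
```

Under this assumed M_s the result comes out at about 0.27 erg/cm², in line with the value reported in the abstract.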
Spin valves are widely studied due to their application as magnetoresistive material in magnetic recording heads and other magnetic field sensors. An important film property is the interlayer coupling field (called offset field $H_o$ or ferromagnetic coupling field $H_f$). It has been shown that the Néel model for orange-peel coupling can be applied successfully to describe this interlayer coupling. The waviness associated with the developing granular structure is thereby taken as the relevant waviness. The original Néel model describes the ferromagnetic magnetostatic interaction between two ferromagnetic layers, of infinite thickness, separated by a nonmagnetic spacer with a correlated interface waviness. In this article, this physical picture is refined to account for the effect of the finite thickness of the magnetic films in a spin valve. Magnetic poles created at the outer surfaces of the magnetic layers result in an antiferromagnetic interaction with the poles at the inner surface of the opposite layer. A simple model is presented for the different interactions in a top spin valve (columnar structure with cumulative waviness on a flat substrate) and for a bottom spin valve (columnar structure with conformal waviness on a wavy substrate). Comparison to experimental data shows that the free and pinned layer thickness dependence can be understood from this refined picture.
We consider nonconvex-concave minimax problems, $\min_x \max_{y \in Y} f(x, y)$, where $f$ is nonconvex in $x$ but concave in $y$. The standard algorithm for solving this problem is the celebrated gradient descent ascent (GDA) algorithm, which has been widely used in machine learning, control theory and economics. However, despite the solid theory for the convex-concave setting, GDA can converge to limit cycles or even diverge in a general setting. In this paper, we present a nonasymptotic analysis of GDA for solving nonconvex-concave minimax problems, showing that GDA can find a stationary point of the function $\Phi(\cdot) := \max_{y \in Y} f(\cdot, y)$ efficiently. To the best of our knowledge, this is the first theoretical guarantee for GDA in this setting, shedding light on its practical performance in many real applications.
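To make the GDA iteration concrete, here is a minimal sketch of projected gradient descent ascent on a toy nonconvex-concave problem. The objective f(x, y) = y·sin(x) − y²/2 with Y = [−1, 1], the step sizes, and the function name `gda` are illustrative assumptions; they are not the setting or the step-size rule analyzed in the paper. For this toy objective, Φ(x) = max over y of f(x, y) = sin²(x)/2, so stationary points of Φ satisfy sin(x)·cos(x) = 0.

```python
import math


def gda(x0, y0, eta_x=0.05, eta_y=0.5, iters=2000):
    """Simultaneous projected GDA on the assumed toy objective
         f(x, y) = y * sin(x) - 0.5 * y**2,  with y constrained to [-1, 1].
    f is nonconvex in x and concave in y.
    """
    x, y = x0, y0
    for _ in range(iters):
        gx = y * math.cos(x)   # partial derivative of f with respect to x
        gy = math.sin(x) - y   # partial derivative of f with respect to y
        x = x - eta_x * gx                        # descent step in x
        y = min(1.0, max(-1.0, y + eta_y * gy))   # ascent step in y, projected onto Y
    return x, y


x, y = gda(1.0, 0.0)
print(x, y, math.sin(x) * math.cos(x))  # last value is the derivative of Phi at x
```

The ascent step is taken larger than the descent step so that y tracks the inner maximizer while x moves slowly; this is a design choice for the sketch.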