We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time that scales linearly in the underlying dimension and the number of training examples. Our algorithm finds an approximate local minimum even faster than gradient descent finds a critical point. It applies to a general class of optimization problems, including training a neural network and other non-convex objectives arising in machine learning.
First-order stochastic methods are the state of the art in large-scale machine learning optimization owing to their efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored due to the high cost of computing second-order information. In this paper we develop second-order stochastic methods for optimization problems in machine learning that match the per-iteration cost of gradient-based methods, and in certain settings improve upon the overall running time of popular first-order methods. Furthermore, our algorithm has the desirable property of being implementable in time linear in the sparsity of the input data.
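The first-order per-iteration cost claimed in the abstracts above typically rests on Hessian-vector products, which cost roughly one extra gradient evaluation and never form the Hessian explicitly. A minimal sketch, using a finite-difference approximation of the product (the function and example below are illustrative, not taken from the papers):

```python
import numpy as np

def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) @ v with two gradient calls (finite differences).

    This is the standard trick that lets stochastic second-order methods
    match the per-iteration cost of gradient methods: the Hessian is
    never formed explicitly.
    """
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

# Toy example: f(x) = 0.5 * x^T A x, so grad f(x) = A x and H(x) = A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad_fn = lambda x: A @ x
x = np.array([1.0, -1.0])
v = np.array([0.5, 2.0])
hv = hessian_vector_product(grad_fn, x, v)  # ≈ A @ v
```

For a quadratic the finite difference is exact up to floating-point error; in general the error is O(eps²). Automatic differentiation frameworks provide the same product exactly at a similar cost.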
We study the control of a linear dynamical system with adversarial disturbances (as opposed to statistical noise). The objective we consider is one of regret: we desire an online control procedure that does nearly as well as a procedure with full knowledge of the disturbances in hindsight. Our main result is an efficient algorithm that provides nearly tight regret bounds for this problem. From a technical standpoint, this work generalizes previous work in two main aspects: our model allows for adversarial noise in the dynamics, and allows for general convex costs.
Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, nonconvex optimization. Akin to the trust-region method, its iterations can be thought of as approximate, safeguarded Newton steps. For cost functions with Lipschitz continuous Hessian, ARC has optimal iteration complexity, in the sense that it produces an iterate with gradient smaller than ε in O(1/ε^1.5) iterations. For the same price, it can also guarantee a Hessian with smallest eigenvalue larger than −√ε. In this paper, we study a generalization of ARC to optimization on Riemannian manifolds. In particular, we generalize the iteration complexity results to this richer framework. Our central contribution lies in the identification of appropriate manifold-specific assumptions that allow us to secure these complexity guarantees both when using the exponential map and when using a general retraction. A substantial part of the paper is devoted to studying these assumptions, relevant beyond ARC, and providing user-friendly sufficient conditions for them. Numerical experiments are encouraging.
Keywords: Optimization on manifolds · Complexity · Lipschitz regularity · Cubic regularization · Newton's method
Mathematics Subject Classification: 90C26 (Nonconvex programming, global optimization) · 53Z99 (Applications of differential geometry to sciences and engineering) · 90C53 (Methods of quasi-Newton type) · 65K05 (Numerical mathematical programming methods)
Authors are listed alphabetically.
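The cubic-regularized model at the heart of ARC can be illustrated in the Euclidean case: each step approximately minimizes m(s) = gᵀs + ½ sᵀHs + (σ/3)‖s‖³, which stays bounded below even when H has negative eigenvalues. The sketch below is illustrative only; the actual paper works on Riemannian manifolds, uses specialized subproblem solvers, and adapts σ with an acceptance test, none of which is shown here:

```python
import numpy as np

def arc_step(g, H, sigma, lr=0.1, iters=500):
    """Approximately minimize the ARC cubic model
        m(s) = g^T s + 0.5 * s^T H s + (sigma/3) * ||s||^3
    by plain gradient descent on s (an illustrative subproblem solver).
    """
    s = np.zeros_like(g)
    for _ in range(iters):
        # Gradient of the model: g + H s + sigma * ||s|| * s
        grad_m = g + H @ s + sigma * np.linalg.norm(s) * s
        s -= lr * grad_m
    return s

# Nonconvex example: H has a negative eigenvalue, yet the cubic term
# keeps the model bounded below, so the step is well defined.
g = np.array([1.0, 0.0])
H = np.array([[1.0, 0.0], [0.0, -2.0]])
s = arc_step(g, H, sigma=1.0)
```

In this example the iterate converges to a stationary point of the model at s ≈ (−0.618, 0), where m(s) < 0 = m(0), i.e. the cubic step makes model progress despite the indefinite Hessian.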
We provide improved convergence rates for constrained convex-concave min-max problems and monotone variational inequalities with higher-order smoothness. In min-max settings where the pth-order derivatives are Lipschitz continuous, we give an algorithm, HigherOrderMirrorProx, that achieves an iteration complexity of O(1/T^((p+1)/2)) when given access to an oracle for finding a fixed point of a pth-order equation. We give analogous rates for the weak monotone variational inequality problem. For p > 2, our results improve upon the iteration complexity of the first-order Mirror Prox method of Nemirovski [2004] and the second-order method of Monteiro and Svaiter [2012]. We further instantiate our entire algorithm in the unconstrained p = 2 case.
In this work, we present new simple and optimal algorithms for solving the variational inequality (VI) problem for pth-order smooth, monotone operators, a problem that generalizes convex optimization and saddle-point problems. Recent works (Bullins and Lai (2020), Lin and Jordan (2021), Jiang and Mokhtari (2022)) present methods that achieve a rate of O(ε^(−2/(p+1))) for p ≥ 1, extending results of Nemirovski (2004) and Monteiro and Svaiter (2012) for p = 1, 2. A drawback to these approaches, however, is their reliance on a line search scheme. We provide the first pth-order method that achieves a rate of O(ε^(−2/(p+1))) without relying on a line search routine, thereby improving upon previous rates by a logarithmic factor. Building on the Mirror Prox method of Nemirovski (2004), our algorithm works even in the constrained, non-Euclidean setting. Furthermore, we prove the optimality of our algorithm by constructing matching lower bounds. These are the first lower bounds for smooth MVIs beyond convex optimization for p > 1. This establishes a separation between solving smooth MVIs and smooth convex optimization, and settles the oracle complexity of solving pth-order smooth MVIs.
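For context, the p = 1 base case that these higher-order methods build on is Nemirovski's Mirror Prox, which reduces to the extragradient method in the unconstrained Euclidean setting. A minimal sketch (step size, iteration count, and the example operator are illustrative choices, not from the papers):

```python
import numpy as np

def extragradient(F, x0, eta=0.1, iters=2000):
    """Extragradient / Euclidean Mirror Prox for a monotone operator F.

    Each iteration makes two operator calls:
        x_half = x - eta * F(x)        # extrapolation step
        x      = x - eta * F(x_half)   # update step
    """
    x = x0.copy()
    for _ in range(iters):
        x_half = x - eta * F(x)
        x = x - eta * F(x_half)
    return x

# Bilinear saddle point min_u max_v (u * v): the operator F(u, v) = (v, -u)
# is monotone, and plain gradient descent-ascent orbits or diverges on it,
# while extragradient converges to the solution (0, 0).
F = lambda z: np.array([z[1], -z[0]])
z = extragradient(F, np.array([1.0, 1.0]))
```

The extrapolation step is what distinguishes the method from simultaneous gradient descent-ascent: evaluating F at the look-ahead point x_half damps the rotation that the bilinear coupling induces.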