Wenlong Mou scite author profile

In this paper, we consider efficient differentially private empirical risk minimization from the viewpoint of optimization algorithms. For strongly convex and smooth objectives, we prove that gradient descent with output perturbation not only achieves nearly optimal utility, but also significantly improves the running time of previous state-of-the-art private optimization algorithms, for both ε-DP and (ε, δ)-DP. For non-convex but smooth objectives, we propose an RRPSGD (Random Round Private Stochastic Gradient Descent) algorithm, which provably converges to a stationary point with a privacy guarantee. Besides the expected utility bounds, we also provide guarantees in high-probability form. Experiments demonstrate that our algorithm consistently outperforms existing methods in both utility and running time.
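As a rough illustration of what "gradient descent with output perturbation" means, the sketch below runs plain gradient descent on a one-dimensional ridge-regularized least-squares objective (strongly convex and smooth) and adds noise once, to the final iterate. The function name, the objective, and the `noise_std` parameter are assumptions made for this example. Calibrating the noise to the sensitivity and to ε or (ε, δ) is the substance of the paper and is not reproduced here, so this code carries no privacy guarantee as written.

```python
import random


def private_gd_output_perturbation(xs, ys, lam, step, iters, noise_std, rng):
    """Hypothetical sketch: gradient descent, then one noise draw on the output."""
    n = len(xs)
    w = 0.0
    for _ in range(iters):
        # Gradient of (1/n) * sum_i 0.5 * (w * x_i - y_i)^2 + 0.5 * lam * w^2.
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / n + lam * w
        w -= step * grad
    # Output perturbation: noise touches only the final iterate.
    # noise_std is a placeholder (assumption); a real mechanism must derive it
    # from the sensitivity of the minimizer and the privacy parameters.
    return w + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    print(private_gd_output_perturbation(xs, ys, 0.1, 0.1, 200, 0.05, rng))
```

Because the noise is drawn only once, the optimization loop itself is unchanged from non-private gradient descent.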

Efficient Private ERM for Smooth Objectives

Zhang, Zheng, Mou et al. 2017

Preprint

Improved Bounds for Discretization of Langevin Diffusions: Near-Optimal Rates without Convexity

Mou, Flammarion, Wainwright 2019

Preprint

We present an improved analysis of the Euler-Maruyama discretization of the Langevin diffusion. Our analysis does not require global contractivity, and yields polynomial dependence on the time horizon. Compared to existing approaches, we make an additional smoothness assumption, and improve the existing rate from O(η) to O(η²) in terms of the KL divergence. This result matches the correct order for numerical SDEs, without suffering from exponential time dependence. When applied to algorithms for sampling and learning, this result simultaneously improves all those methods based on Dalalyan's approach.
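For readers unfamiliar with the scheme being analyzed, here is a minimal sketch of the Euler-Maruyama discretization of the Langevin diffusion dX = −∇U(X) dt + √2 dB in one dimension, with step size η: each step moves against the gradient and adds Gaussian noise of variance 2η. The function name and the standard Gaussian potential U(x) = x²/2 in the usage lines are assumptions chosen for illustration, not taken from the paper.

```python
import math
import random


def euler_maruyama_langevin(grad_u, x0, eta, n_steps, rng):
    """Simulate x_{k+1} = x_k - eta * grad_u(x_k) + sqrt(2 * eta) * xi_k."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)  # standard normal increment
        x = x - eta * grad_u(x) + math.sqrt(2.0 * eta) * xi
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative target (assumption): U(x) = x^2 / 2, so grad U(x) = x.
    path = euler_maruyama_langevin(lambda x: x, 0.0, 0.05, 50000, rng)
    tail = path[1000:]  # discard an initial burn-in segment
    mean = sum(tail) / len(tail)
    print(sum((x - mean) ** 2 for x in tail) / len(tail))
```

For this quadratic potential the discretized chain has stationary variance 1/(1 − η/2) rather than 1, a simple instance of the step-size-dependent bias that discretization bounds quantify.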

Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints

Mou, Wang, Zhai et al. 2017

Preprint

Algorithm-dependent generalization error bounds are central to statistical learning theory. A learning algorithm may use a large hypothesis space, but the limited number of iterations controls its model capacity and generalization error. The impacts of stochastic gradient methods on generalization error for non-convex learning problems not only have important theoretical consequences, but are also critical to generalization errors of deep learning. In this paper, we study the generalization errors of Stochastic Gradient Langevin Dynamics (SGLD) with non-convex objectives. Two theories are proposed with non-asymptotic discrete-time analysis, using stability and PAC-Bayesian results respectively. The stability-based theory obtains a bound of O((1/n) L √(β T_k)), where L is the uniform Lipschitz parameter, β is the inverse temperature, and T_k is the aggregated step sizes. For the PAC-Bayesian theory, though the bound has a slower O(1/√n) rate, the contribution of each step is shown with an exponentially decaying factor by imposing ℓ₂ regularization, and the uniform Lipschitz constant is also replaced by actual norms of gradients along the trajectory. Our bounds have no implicit dependence on dimensions, norms or other capacity measures of the parameter, which elegantly characterizes the phenomenon of "Fast Training Guarantees Generalization" in non-convex settings. This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and has important implications for statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.
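To make the quantities in the bound concrete, the following sketch shows a commonly used form of the SGLD update, θ ← θ − η·g + √(2η/β)·ξ, where g is a minibatch gradient, β is the inverse temperature, and ξ is standard Gaussian noise. It also returns the sum of the step sizes, which is how "aggregated step sizes" is read here. The function signature, the scalar parameter, and the quadratic loss in the usage lines are assumptions for illustration; the paper's setting is non-convex and high-dimensional.

```python
import math
import random


def sgld(data, grad_loss, theta0, step_sizes, beta, batch_size, rng):
    """Hypothetical scalar SGLD loop; returns (final theta, sum of step sizes)."""
    theta = theta0
    for eta in step_sizes:
        batch = rng.sample(data, batch_size)  # minibatch without replacement
        g = sum(grad_loss(theta, z) for z in batch) / batch_size
        noise = math.sqrt(2.0 * eta / beta) * rng.gauss(0.0, 1.0)
        theta = theta - eta * g + noise
    return theta, sum(step_sizes)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [1.0, 2.0, 3.0, 4.0]
    # Illustrative loss (assumption): 0.5 * (theta - z)^2, gradient theta - z.
    theta, total_step = sgld(data, lambda t, z: t - z, 0.0,
                             [0.1] * 500, beta=10.0, batch_size=2, rng=rng)
    print(theta, total_step)
```

A larger β means less injected noise per step, and fewer or smaller steps mean a smaller aggregated step size; both quantities appear under the square root in the stability-based bound quoted above.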

An Efficient Sampling Algorithm for Non-smooth Composite Potentials

Mou, Flammarion, Wainwright 2019

Preprint

We consider the problem of sampling from a density of the form p(x) ∝ exp(−f(x) − g(x)), where f : ℝ^d → ℝ is a smooth and strongly convex function and g : ℝ^d → ℝ is a convex and Lipschitz function. We propose a new algorithm based on the Metropolis-Hastings framework, and prove that it mixes to within TV distance ε of the target density in at most O(d log(d/ε)) iterations. This guarantee extends previous results on sampling from distributions with smooth log densities (g = 0) to the more general composite non-smooth case, with the same mixing time up to a multiple of the condition number. Our method is based on a novel proximal-based proposal distribution that can be efficiently computed for a large class of non-smooth functions g.
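The sketch below illustrates only the Metropolis-Hastings framework for a composite target p(x) ∝ exp(−f(x) − g(x)) in one dimension. It deliberately substitutes a plain Gaussian random-walk proposal for the paper's proximal-based proposal, which the abstract does not specify, so it should not be read as the authors' algorithm and none of the stated mixing-time guarantees apply to it. The function name and the choices f(x) = x²/2 and g(x) = |x| are assumptions for illustration.

```python
import math
import random


def random_walk_mh(f, g, x0, proposal_std, n_steps, rng):
    """Random-walk Metropolis-Hastings for a density proportional to exp(-f - g)."""
    def log_density(x):
        return -f(x) - g(x)

    x = x0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        y = x + proposal_std * rng.gauss(0.0, 1.0)  # symmetric proposal
        log_ratio = log_density(y) - log_density(x)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


if __name__ == "__main__":
    rng = random.Random(0)
    # Smooth, strongly convex f plus convex, Lipschitz, non-smooth g (assumptions).
    samples, rate = random_walk_mh(lambda x: 0.5 * x * x, abs, 0.0, 1.0, 20000, rng)
    print(sum(samples) / len(samples), rate)
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of target densities; a non-symmetric proposal such as a proximal-based one would also need the proposal densities in that ratio.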
