We obtain estimation and excess risk bounds for Empirical Risk Minimizers (ERM) and minmax Median-of-Means (MOM) estimators based on loss functions that are both Lipschitz and convex. Results for the ERM are derived under weak assumptions on the outputs and subgaussian assumptions on the design, as in [2]. The difference with [2] is that the global Bernstein condition of that paper is relaxed here into a local assumption. We also obtain estimation and excess risk bounds for minmax MOM estimators under similar assumptions on the outputs and only moment assumptions on the design. Moreover, the dataset may contain outliers in both the input and output variables without deteriorating the performance of the minmax MOM estimators. Unlike alternatives based on the MOM principle [24, 29], the analysis of minmax MOM estimators does not rely on the small ball assumption (SBA) of [22]. In particular, the basic example of nonparametric statistics where the learning class is the linear span of localized bases, which does not satisfy the SBA [39], can now be handled. Finally, minmax MOM estimators are analysed in a setting where the local Bernstein condition is also dropped; they are shown to achieve excess risk bounds with exponentially large probability under minimal assumptions ensuring only the existence of all objects.

• The hinge loss, defined for any u ∈ Ȳ = ℝ and y ∈ Y = {−1, 1} by ℓ(u, y) = max(1 − uy, 0), satisfies Assumption 1 with L = 1.
• The Huber loss with parameter δ > 0, defined for any u, y ∈ ℝ by ℓ(u, y) = (1/2)(u − y)² if |u − y| ≤ δ and ℓ(u, y) = δ|u − y| − δ²/2 otherwise, satisfies Assumption 1 with L = δ.
• The quantile loss with parameter τ ∈ (0, 1), defined for any u, y ∈ ℝ by ℓ(u, y) = τ(y − u) if y ≥ u and ℓ(u, y) = (1 − τ)(u − y) otherwise, satisfies Assumption 1 with L = 1. For τ = 1/2, the quantile loss is the L¹ loss, up to a factor 1/2.

Throughout the paper, the following assumption is also granted.

Assumption 2. The class F is convex.

The empirical risk minimizers (ERM) [43], obtained by minimizing f ∈ F ↦ R_N(f), are expected to be close to the oracle f*. This procedure and its regularized versions have been extensively studied in learning theory [20].
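The three losses above are all Lipschitz in their first argument, which can be checked numerically. Here is a minimal sketch (the function names are ours, not the paper's) of each loss with its Lipschitz constant noted in a comment:

```python
import numpy as np

def hinge(u, y):
    """Hinge loss: L = 1; y in {-1, +1}."""
    return np.maximum(1.0 - u * y, 0.0)

def huber(u, y, delta=1.0):
    """Huber loss: L = delta; quadratic near zero, linear in the tails."""
    r = np.abs(u - y)
    return np.where(r <= delta, 0.5 * r ** 2, delta * r - 0.5 * delta ** 2)

def quantile(u, y, tau=0.5):
    """Quantile (pinball) loss: L = max(tau, 1 - tau) <= 1."""
    r = y - u
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)
```

For τ = 1/2, `quantile` reduces to half the absolute (L¹) loss, matching the remark above.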
When the loss is both convex and Lipschitz, results have been obtained both in practice [4, 12] and in theory [42]. Risk bounds with exponential deviation inequalities for the ERM can be obtained under weak assumptions on the outputs Y, but stronger assumptions on the design X. Moreover, fast rates of convergence [41] can only be obtained under margin-type assumptions such as the Bernstein condition [8, 42]. The Lipschitz assumption together with a global Bernstein condition (one holding over the entire class F, as in [2]) implies boundedness of the class F in L²-norm; see the discussion preceding Assumption 4 for details. This boundedness fails in linear regression with unbounded design, so the results of [2] do not apply even to such a basic example as linear regression with a Gaussian design. To bypass this restriction, the global condition is relaxed into a "local" one, as in [15, 42]; see Assumption 4 below. The main constraint in our results on ERM is the assumption on the design. This constraint can be relaxed by considering alternative estimators based on the "median-of-means" (MOM) principle of [...
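The basic median-of-means principle invoked above can be sketched in a few lines: split the sample into K blocks, average within each block, and take the median of the block means. This is a hedged illustration of the principle only (the block count K and equal-size splitting are our choices, not the paper's minmax construction):

```python
import numpy as np

def median_of_means(x, K):
    """Median of the K block means of the sample x."""
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, K)          # K (nearly) equal-size blocks
    return float(np.median([b.mean() for b in blocks]))
```

A single gross outlier corrupts at most one block mean, so the median of the block means is unaffected, which is the mechanism behind the robustness to outliers claimed for MOM-based estimators.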
We study Empirical Risk Minimizers (ERM) and Regularized Empirical Risk Minimizers (RERM) for regression problems with convex and L-Lipschitz loss functions. We consider a setting where |O| malicious outliers contaminate the labels. In that case, under a local Bernstein condition, we show that the L²-error rate is bounded by r_N + AL|O|/N, where N is the total number of observations, r_N is the L²-error rate in the non-contaminated setting and A is a parameter coming from the local Bernstein condition. When r_N is minimax-rate-optimal in the non-contaminated setting, the rate r_N + AL|O|/N is also minimax-rate-optimal when |O| outliers contaminate the labels. The main results of the paper can be used for many non-regularized and regularized procedures under weak assumptions on the noise. We present results for Huber's M-estimators (without penalization or regularized by the ℓ¹-norm) and for general regularized learning problems in reproducing kernel Hilbert spaces when the noise can be heavy-tailed.
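The robustness of Huber's M-estimator to a few corrupted labels can be illustrated with a plain subgradient descent on the Huber loss (this is our own minimal sketch, not the paper's procedure; the step size, iteration count, and δ are illustrative choices):

```python
import numpy as np

def huber_grad(r, delta):
    """Derivative of the Huber loss w.r.t. the residual r = <x, w> - y:
    equal to r for |r| <= delta, and clipped to +/- delta in the tails."""
    return np.clip(r, -delta, delta)

def huber_fit(X, y, delta=1.0, lr=0.1, n_iter=1000):
    """Gradient descent on the empirical Huber risk for linear regression."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = X @ w - y
        w -= lr * (X.T @ huber_grad(r, delta)) / len(y)
    return w
```

Because the gradient of each term is clipped at δ, a few labels moved arbitrarily far away contribute a bounded amount to the gradient, so the fitted slope stays close to the true one, in line with the r_N + AL|O|/N behaviour described above.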