Bhattacharjee, Robi scite author profile

Bhattacharjee, Robi

5 Publications

3 Citation Statements Received

35 Citation Statements Given

Publications

No-substitution k-means Clustering with Adversarial Order

Robi¹, Moshkovitz²

2020

Preprint

We investigate k-means clustering in the online no-substitution setting when the input arrives in arbitrary order. In this setting, points arrive one after another, and the algorithm is required to instantly decide whether to take the current point as a center before observing the next point. Decisions are irrevocable. The goal is to minimize both the number of centers and the k-means cost. Previous works in this setting assume that the input's order is random, or that the input's aspect ratio is bounded. It is known that if the order is arbitrary and there is no assumption on the input, then any algorithm must take all points as centers. Moreover, assuming a bounded aspect ratio is too restrictive: it does not include natural input generated from mixture models. We introduce a new complexity measure that quantifies the difficulty of clustering a dataset arriving in arbitrary order. We design a new random algorithm and prove that if applied on data with complexity $d$, the algorithm takes $O(d \log(n)\, k \log(k))$ centers and is an $O(k^3)$-approximation. We also prove that if the data is sampled from a "natural" distribution, such as a mixture of $k$ Gaussians, then the new complexity measure is equal to $O(k^2 \log(n))$. This implies that for data generated from those distributions, our new algorithm takes only $\mathrm{poly}(k \log(n))$ centers and is a $\mathrm{poly}(k)$-approximation. In terms of negative results, we prove that the number of centers needed to achieve an $\alpha$-approximation is at least $\Omega\left(\frac{d}{k \log(n\alpha)}\right)$.
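The online no-substitution protocol described in this abstract can be illustrated with a minimal sketch. The distance-threshold rule below is a hypothetical stand-in chosen for simplicity; it is not the algorithm from the paper, and the names `online_no_substitution`, `kmeans_cost`, and `threshold` are assumptions made for illustration. The sketch only shows the constraints of the setting: each point is seen once, the keep-or-skip decision is made immediately, and a chosen center is never removed.

```python
import math


def online_no_substitution(stream, threshold):
    """Process points one at a time; every center decision is irrevocable.

    Hypothetical rule (not the paper's algorithm): take a point as a center
    when it lies farther than `threshold` from all centers chosen so far.
    """
    centers = []
    for point in stream:
        # Distance to the nearest chosen center (infinite if none exist yet).
        nearest = min((math.dist(point, c) for c in centers), default=math.inf)
        if nearest > threshold:
            centers.append(point)  # Never removed or replaced later.
    return centers


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2), (0.0, 0.1)]
centers = online_no_substitution(stream, threshold=1.0)
cost = kmeans_cost(stream, centers)
print(centers, cost)
```

A fixed threshold like this has no guarantee under adversarial order, which is the gap the abstract's complexity measure is meant to address.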

An Exploration of Multicalibration Uniform Convergence Bounds

Rosenberg¹, Robi², Fawaz³

et al. 2022

Preprint

Recent works have investigated the sample complexity necessary for fair machine learning. The most advanced of such sample complexity bounds are developed by analyzing multicalibration uniform convergence for a given predictor class. We present a framework which yields multicalibration error uniform convergence bounds by reparametrizing sample complexities for Empirical Risk Minimization (ERM) learning. From this framework, we demonstrate that multicalibration error exhibits dependence on the classifier architecture as well as the underlying data distribution. We perform an experimental evaluation to investigate the behavior of multicalibration error for different families of classifiers. We compare the results of this evaluation to multicalibration error concentration bounds. Our investigation provides additional perspective on both algorithmic fairness and multicalibration error convergence bounds. Given the prevalence of ERM sample complexity bounds, our proposed framework enables machine learning practitioners to easily understand the convergence behavior of multicalibration error for a myriad of classifier architectures.
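To make the quantity in this abstract concrete, here is a rough sketch of an empirical multicalibration-style error. Definitions vary across papers, so this is one simplified formulation and an assumption rather than the authors' definition: predictions are bucketed into bins, and the error is the largest gap between mean label and mean prediction over all (group, bin) cells. The function name and the `n_bins` and `min_count` parameters are hypothetical.

```python
from collections import defaultdict


def multicalibration_error(preds, labels, groups, n_bins=10, min_count=1):
    """Worst-case calibration gap over (group, prediction-bin) cells.

    Simplified, assumed formulation: `groups[i]` is the set of (possibly
    overlapping) groups that example i belongs to; cells with fewer than
    `min_count` examples are ignored.
    """
    cells = defaultdict(list)
    for p, y, memberships in zip(preds, labels, groups):
        bin_index = min(int(p * n_bins), n_bins - 1)  # Clamp p == 1.0.
        for g in memberships:
            cells[(g, bin_index)].append((p, y))

    worst = 0.0
    for cell in cells.values():
        if len(cell) < min_count:
            continue
        mean_pred = sum(p for p, _ in cell) / len(cell)
        mean_label = sum(y for _, y in cell) / len(cell)
        worst = max(worst, abs(mean_label - mean_pred))
    return worst


preds = [0.8, 0.8, 0.2, 0.2]
labels = [1, 1, 0, 1]
groups = [{"a"}, {"a", "b"}, {"b"}, {"a", "b"}]
err_all = multicalibration_error(preds, labels, groups, n_bins=5)
err_min2 = multicalibration_error(preds, labels, groups, n_bins=5, min_count=2)
print(err_all, err_min2)
```

Raising `min_count` drops sparsely populated cells, which hints at why the number of samples per cell matters for convergence of the empirical error.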

Sample Complexity of Adversarially Robust Linear Classification on Separated Data

Robi¹, Jha², Chaudhuri³

2020

Preprint

We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity tells an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $\Omega(d/n)$, whereas the max margin algorithm has expected standard loss $O(1/n)$. This shows a gap in the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss $O(1/n)$. This shows that for very well-separated data, convergence rates of $O(1/n)$ are achievable, which is not the case otherwise. Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).
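The distinction between standard and robust loss for a linear classifier can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: it relies on the standard fact that a point $(x, y)$ with $y \in \{-1, +1\}$ can be misclassified by some perturbation of $\ell_p$ norm at most $r$ exactly when $y(w \cdot x + b) \le r \lVert w \rVert_q$, where $q$ is the dual exponent with $1/p + 1/q = 1$. The function names and the toy data are hypothetical.

```python
import math


def dual_norm(w, p):
    """Norm of w in the dual exponent q, where 1/p + 1/q = 1 and p > 1."""
    if p == math.inf:
        return sum(abs(v) for v in w)  # Dual of l_inf is l_1.
    q = p / (p - 1)
    return sum(abs(v) ** q for v in w) ** (1 / q)


def robust_loss(w, b, data, radius, p):
    """Fraction of points an l_p perturbation of size <= radius can flip.

    With radius == 0 this reduces to the standard 0-1 loss (points exactly
    on the decision boundary are counted as errors here).
    """
    slack = radius * dual_norm(w, p)
    errors = 0
    for x, y in data:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin <= slack:
            errors += 1
    return errors / len(data)


w, b = (1.0, 0.0), 0.0
data = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1)]
standard = robust_loss(w, b, data, radius=0.0, p=2)
robust = robust_loss(w, b, data, radius=1.0, p=math.inf)
print(standard, robust)
```

In this toy data the classifier has zero standard loss but nonzero robust loss, because one correctly classified point sits closer to the boundary than the robustness radius.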

Consistent Non-Parametric Methods for Maximizing Robustness

Robi¹, Chaudhuri²

2021

Preprint

Online $k$-means Clustering on Arbitrary Data Streams

Robi¹, Imola², Moshkovitz³

et al. 2021

Preprint
