We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise-tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [Valiant 1985; Kearns and Li 1988] and the adversarial label noise model of Kearns, Schapire, and Sellie [1994]. For malicious noise, where the adversary can corrupt both the label and the features, we provide a polynomial-time algorithm for learning linear separators in R^d under isotropic log-concave distributions that tolerates a nearly information-theoretically optimal noise rate of η = Ω(ε), improving on the Ω(ε³ / log²(d/ε)) noise tolerance of [Klivans et al. 2009a]. When the distribution is uniform over the unit ball, this improves on the Ω(ε / d^(1/4)) noise tolerance of [Kalai et al. 2005] and the Ω(ε² / log(d/ε)) of [Klivans et al. 2009a]. For the adversarial label noise model, where the distribution over the feature vectors is unchanged and the overall probability of a noisy label is constrained to be at most η, we also give a polynomial-time algorithm for learning linear separators in R^d under isotropic log-concave distributions that handles a noise rate of η = Ω(ε). For the uniform distribution, this improves over the results of [Kalai et al. 2005], which either required runtime super-exponential in 1/ε (ours is polynomial in 1/ε) or tolerated less noise. Our algorithms are also efficient in the active learning setting, where learning algorithms only receive the classifications of examples when they ask for them. We show that, in this model, our algorithms achieve a label complexity whose dependence on the error parameter ε is polylogarithmic (and thus exponentially better than that of any passive algorithm).
This provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise. Our algorithms and analysis combine several ingredients, including aggressive localization, minimization of a progressively rescaled hinge loss, and a novel localized and soft outlier removal procedure. We use localization techniques (previously used for obtaining better sample complexity results) to obtain better noise-tolerant polynomial-time algorithms.
Abstract. Motivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [13] proposed analyzing objective-based clustering problems under the assumption that the optimum clustering for the objective is preserved under small multiplicative perturbations to the distances between points. The hope is that, by exploiting the structure in such instances, one can overcome worst-case hardness results. In this paper, we provide several results within this framework. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to perturbations of factor (1 + √2), solving an open problem of Awasthi et al. [3]. For k-median, a center-based objective of special interest, we additionally give algorithms under a more relaxed assumption in which we allow the optimal solution to change in a small ε fraction of the points after perturbation. We give the first bounds known for k-median under this more realistic and more general assumption. We also provide positive results for min-sum clustering, which is typically a harder objective than center-based objectives from an approximability standpoint. Our algorithms are based on new linkage criteria that may be of independent interest. Additionally, we give sublinear-time algorithms that can return an implicit clustering from access to only a small random sample.

Key words. clustering, perturbation resilience, k-median clustering, min-sum clustering

AMS subject classifications. 68Q25, 68Q32, 68T05, 68W25, 68W40

1. Introduction. Problems of clustering data from pairwise distance information are ubiquitous in science. A common approach for solving such problems is to view the data points as nodes in a weighted graph (with weights based on the given pairwise information), and then to design algorithms to optimize various objective functions such as k-median or min-sum.
For example, in the k-median clustering problem the goal is to partition the data into k clusters C_i, giving each a center c_i, so as to minimize the sum of the distances of all data points to the centers of their clusters. In min-sum clustering the goal is to find k clusters C_i that minimize the sum of all intra-cluster pairwise distances. Unfortunately, for most natural clustering objectives, finding the optimal solution to the objective function is NP-hard. As a consequence, there has been substantial work on approximation algorithms [18, 14, 9, 15, 1], with both upper and lower bounds on the approximability of these objective functions on worst-case instances. Recently, Bilu and Linial [13] suggested an exciting alternative approach aimed at understanding the complexity of clustering instances that arise in practice. Motivated by the fact that distances between data points in clustering instances are often based on a heuristic measure, they argue that interesting instances should be resilient to small perturbations in these distances. In particular, if small pertur...
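The two objectives just described can be stated concretely. The helper functions below (names are ours, for illustration) compute the k-median and min-sum costs of a given clustering:

```python
import numpy as np

def k_median_cost(X, labels, centers):
    # k-median: sum over all points of the distance to their cluster's center
    return sum(np.linalg.norm(X[labels == i] - c, axis=1).sum()
               for i, c in enumerate(centers))

def min_sum_cost(X, labels):
    # min-sum: sum of all intra-cluster pairwise distances
    total = 0.0
    for i in np.unique(labels):
        P = X[labels == i]
        D = np.linalg.norm(P[:, None] - P[None, :], axis=2)
        total += D.sum() / 2          # each unordered pair counted once
    return total
```

For instance, two tight clusters far apart have small cost under both objectives, but the objectives can disagree on which of two candidate clusterings is better, which is why they are studied separately.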
This work addresses the problem of segmenting an object of interest out of a video. We show that video object segmentation can be naturally cast as a semi-supervised learning problem and efficiently solved using harmonic functions. We propose an incremental self-training approach that iteratively labels the least uncertain frame and updates the similarity metrics. Our self-training video segmentation produces superior results both qualitatively and quantitatively. Moreover, the use of harmonic functions naturally supports interactive segmentation. We suggest active learning methods that guide the user on what to annotate in order to improve labeling efficiency. We present experimental results using a ground-truth data set and a quantitative comparison to a representative object segmentation system.
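The harmonic-function step underlying this kind of semi-supervised labeling reduces to a linear solve on the graph Laplacian. The sketch below shows the standard construction (Zhu, Ghahramani, and Lafferty, 2003) on a generic affinity matrix; it is a generic illustration, not the paper's exact pipeline:

```python
import numpy as np

def harmonic_labels(W, labeled_idx, labeled_vals):
    # W: symmetric affinity matrix over all nodes (frames/pixels).
    # Harmonic solution: L_uu f_u = W_ul f_l, where L = D - W.
    n = W.shape[0]
    unlabeled = np.setdiff1d(np.arange(n), labeled_idx)
    L = np.diag(W.sum(axis=1)) - W
    f = np.zeros(n)
    f[labeled_idx] = labeled_vals
    A = L[np.ix_(unlabeled, unlabeled)]
    b = W[np.ix_(unlabeled, labeled_idx)] @ np.asarray(labeled_vals, float)
    f[unlabeled] = np.linalg.solve(A, b)   # each unlabeled value becomes the
    return f                                # weighted average of its neighbors
```

The resulting soft labels f in [0, 1] also give a natural uncertainty signal (values near 0.5 are least certain), which is what an active learning criterion like the one proposed here can exploit.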
Kernel Principal Component Analysis (KPCA) is a key machine learning algorithm for extracting nonlinear features from data. In the presence of a large volume of high-dimensional data collected in a distributed fashion, it becomes very costly to communicate all of this data to a single data center and then perform kernel PCA. Can we perform kernel PCA on the entire dataset in a distributed and communication-efficient fashion while maintaining provable and strong guarantees on solution quality? In this paper, we give an affirmative answer to this question by developing a communication-efficient algorithm for kernel PCA in the distributed setting. The algorithm is a clever combination of subspace embedding and adaptive sampling techniques, and we show that it can take as input an arbitrary configuration of distributed datasets and compute a set of global kernel principal components with relative-error guarantees independent of the dimension of the feature space or the total number of data points. In particular, computing k principal components with relative error ε over s workers has communication cost Õ(sρk/ε + sk²/ε³) words, where ρ is the average number of nonzero entries in each data point. Furthermore, we evaluated the algorithm on large-scale real-world datasets and showed that it produces a high-quality kernel PCA solution while using significantly less communication than alternative approaches.
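For reference, the centralized computation that the distributed algorithm approximates can be sketched as follows. The RBF kernel and the parameter names are illustrative choices, not the paper's setup:

```python
import numpy as np

def kernel_pca(X, k, gamma=1.0):
    # Centralized kernel PCA: build an RBF kernel matrix, double-center it
    # (centering in feature space), and take the top-k eigenvectors.
    n = len(X)
    sq = ((X[:, None] - X[None, :]) ** 2).sum(axis=2)
    K = np.exp(-gamma * sq)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)          # ascending eigenvalue order
    order = np.argsort(vals)[::-1][:k]       # pick the k largest
    # scale eigenvectors by sqrt(eigenvalue) to get the projected coordinates
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))
```

Since K is n × n, this costs O(n²) space and communication if the data are centralized, which is exactly the cost the distributed sketching-and-sampling approach avoids.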