In this paper, we present a new iterative rounding framework for many clustering problems. Using this, we obtain an (α₁ + ε ≤ 7.081 + ε)-approximation algorithm for k-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [16]. For k-means with outliers, we give an (α₂ + ε ≤ 53.002 + ε)-approximation, which is the first O(1)-approximation for this problem. The iterative rounding framework is very versatile; we show how it can be used to give α₁- and (α₁ + ε)-approximation algorithms for the matroid and knapsack median problems respectively, improving upon the previous best approximation ratios of 8 [42] and 17.46 [9].

The natural LP relaxation for the k-median/k-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which at most two facilities are fractionally open. Thus, the LP integrality gap arises from the gap between almost-integral and fully-integral solutions. Then, using a pre-processing procedure, we show how to convert an almost-integral solution to a fully-integral solution while losing only a constant factor in the approximation ratio. By further using a sparsification technique, the additive factor lost in the conversion can be reduced to any ε > 0.

Both problems admit PTASs [2, 21, 19] on fixed-dimensional Euclidean metrics. Despite their simplicity and elegance, a significant shortcoming these formulations face on real-world data sets is that they are not robust to noisy points: a few outliers can completely change the cost as well as the structure of solutions. To overcome this shortcoming, Charikar et al. [12] introduced the robust k-median (RkMed) problem (also called k-median with outliers), which we now define.

Definition 1.1 (The Robust k-Median and k-Means Problems). The input to the Robust k-Median (RkMed) problem is a set C of clients, a set F of facility locations, a metric space (C ∪ F, d), and integers k and m. The objective is to choose a subset S ⊆ F of cardinality at most k and a subset C* ⊆ C of cardinality at least m such that the total cost ∑_{j∈C*} d(j, S) is minimized. In the Robust k-Means (RkMeans) problem, we have the same input, and the goal is to minimize ∑_{j∈C*} d²(j, S).

The problem is not just interesting from the clustering point of view. In fact, such a joint view of clustering and removing outliers has been observed to be more effective [15, 39] even for the sole task of outlier detection, a very important problem in the real world. Due to these use cases, there has been much recent work [15, 23, 40] in the applied community on these problems. However, their inherent complexity from the theoretical side is much less understood. For RkMed, Charikar et al. [12] give an algorithm that violates the number of outliers by a factor of (1 + ε), and has cost at most 4(1 + 1/ε) times the optimal cost. Chen [1...
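For concreteness, the natural LP relaxation referred to above can be written in the standard form below, with opening variables y_i and assignment variables x_ij (our notation, not necessarily the paper's); for RkMeans, replace d(i,j) by d(i,j)².

```latex
\begin{align*}
\min\;& \sum_{i \in F}\sum_{j \in C} d(i,j)\, x_{ij} \\
\text{s.t.}\;& \sum_{i \in F} x_{ij} \le 1 \quad \forall j \in C
    && \text{(each client is served at most once)}\\
& x_{ij} \le y_i \quad \forall i \in F,\ j \in C
    && \text{(clients use only open facilities)}\\
& \sum_{i \in F} y_i \le k
    && \text{(at most $k$ facilities open)}\\
& \sum_{j \in C}\sum_{i \in F} x_{ij} \ge m
    && \text{(at least $m$ clients are served)}\\
& x,\, y \ge 0.
\end{align*}
```

The unbounded integrality gap arises already from this relaxation; the paper's framework rounds it to a solution with at most two fractional y_i, and the remaining gap is closed combinatorially.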
In this paper, we introduce and study the Non-Uniform k-Center (NUkC) problem. Given a finite metric space (X, d) and a collection of balls of radii {r₁ ≥ ⋯ ≥ r_k}, the NUkC problem is to find a placement of their centers in the metric space and the minimum dilation α such that the union of balls of radius α · rᵢ around the i-th center covers all the points in X. This problem naturally arises as a min-max vehicle routing problem with fleets of different speeds, or as a wireless router placement problem with routers of different powers/ranges.

The NUkC problem generalizes the classic k-center problem, which is the case when all k radii are the same (and can be assumed to be 1 after scaling). It also generalizes the k-center with outliers (kCwO for short) problem, which is the case with k balls of radius 1 and a suitable number of balls of radius 0. There are 2-approximation and 3-approximation algorithms known for these problems respectively; the former is best possible unless P = NP, and the latter has remained unimproved for 15 years.

We first observe that no O(1)-approximation to the optimal dilation is possible unless P = NP, implying that the NUkC problem is more non-trivial than the above two problems. Our main algorithmic result is an (O(1), O(1))-bi-criteria approximation: we give an O(1)-approximation to the optimal dilation, but we may open Θ(1) centers of each radius. Our techniques also allow us to prove a simple (uni-criteria), optimal 2-approximation to the kCwO problem, improving upon the long-standing 3-factor. Our main technical contribution is a connection between the NUkC problem and the so-called firefighter problems on trees, which have been studied recently in the TCS community. We show that NUkC is as hard as the firefighter problem. While we do not know whether the converse is true, we are able to adapt ideas from recent works [4, 1] in non-trivial ways to obtain our constant-factor bi-criteria approximation.
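The classic uniform k-center problem that NUkC generalizes admits a simple greedy 2-approximation, Gonzalez's farthest-point heuristic. A minimal sketch for intuition (the instance format and names are ours, not the paper's):

```python
import math

def k_center_greedy(points, k):
    """Gonzalez's farthest-point heuristic: a 2-approximation for
    uniform k-center (every ball has the same radius).

    points: list of (x, y) tuples; k: number of centers to open.
    Returns (centers, radius), where radius is the max distance
    from any point to its nearest chosen center.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    centers = [points[0]]  # start from an arbitrary point
    # d_near[i] = distance from points[i] to its nearest chosen center
    d_near = [dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # open the point currently farthest from all chosen centers
        i = max(range(len(points)), key=lambda i: d_near[i])
        centers.append(points[i])
        d_near = [min(d_near[j], dist(points[j], points[i]))
                  for j in range(len(points))]

    return centers, max(d_near)

# toy usage
pts = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 5)]
print(k_center_greedy(pts, k=2))
```

With non-uniform radii this greedy rule breaks down, which is one way to see why NUkC requires the bi-criteria machinery developed in the paper.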
In this article, we provide a constant-factor approximation for the (p, 3)-flexible graph connectivity problem, improving upon the previous best-known O(p)-approximation.
In the stochastic knapsack problem, we are given a knapsack of size B and a set of jobs whose sizes and rewards are drawn from a known probability distribution. However, the only way to learn the actual size and reward of a job is to schedule it: when it completes, we get to know these values. How should we schedule jobs to maximize the expected total reward? Constant-factor approximations are known for this problem under the assumptions that rewards and sizes are independent random variables, and that jobs cannot be prematurely canceled after being scheduled. What can we say when either or both of these assumptions are changed?

The stochastic knapsack problem is of interest in its own right, but techniques developed for it are applicable to other stochastic packing problems. Indeed, ideas for this problem have been useful for budgeted learning problems, where one is given several arms which evolve in a specified stochastic fashion with each pull, and the goal is to pull the arms a total of B times to maximize the reward obtained. Much recent work on this problem focuses on the case when the evolution of the arms follows a martingale, i.e., when the expected reward from the future is the same as the reward at the current state. What can we say when the rewards do not form a martingale?

In this paper, we give constant-factor approximation algorithms for the stochastic knapsack problem with correlations and/or cancellations, and, using similar ideas, for budgeted learning problems where the martingale condition is not satisfied. Indeed, we can show that previously proposed linear programming relaxations for these problems have large integrality gaps. We propose new time-indexed LP relaxations; using a decomposition and "gap-filling" approach, we convert these fractional solutions to distributions over strategies, and then use the LP values and the time-ordering information from these strategies to devise a randomized adaptive scheduling algorithm. We hope our LP formulation and decomposition methods may provide a new way to address other correlated bandit problems with more general contexts.
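To make the model concrete, here is a small, self-contained simulation of the basic setting (no cancellations; a job's size and reward are revealed only when it completes). The toy instance and the density-ordering heuristic are illustrative choices of ours, not the paper's algorithm:

```python
import random

# Each job is a discrete distribution over (size, reward) outcomes:
# a list of (probability, size, reward) triples summing to 1.
jobs = [
    [(0.5, 1, 2), (0.5, 4, 10)],   # job 0: small and cheap, or big and valuable
    [(1.0, 2, 3)],                 # job 1: deterministic
    [(0.9, 1, 1), (0.1, 6, 20)],   # job 2: usually tiny, rarely huge
]

def sample(job):
    """Draw an actual (size, reward) outcome for a job."""
    r, acc = random.random(), 0.0
    for p, size, reward in job:
        acc += p
        if r <= acc:
            return size, reward
    return job[-1][1], job[-1][2]

def run_policy(order, B, trials=100_000):
    """Expected reward of scheduling jobs in a fixed order with budget B.
    A job that does not finish within the remaining budget earns nothing."""
    total = 0.0
    for _ in range(trials):
        remaining = B
        for j in order:
            size, reward = sample(jobs[j])
            remaining -= size
            if remaining < 0:
                break              # job overran the knapsack: no reward
            total += reward
    return total / trials

# A simple non-adaptive heuristic: order by expected reward per expected size.
def density(job):
    exp_size = sum(p * s for p, s, _ in job)
    exp_reward = sum(p * r for p, _, r in job)
    return exp_reward / exp_size

order = sorted(range(len(jobs)), key=lambda j: -density(jobs[j]))
print(order, run_policy(order, B=5))
```

An adaptive policy can condition the next choice on the observed remaining budget, which is exactly the extra power the paper's LP-based strategies exploit.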
In this paper, we study the set cover problem in the fully dynamic model. In this model, the set of active elements, i.e., those that must be covered at any given time, can change due to element arrivals and departures. The goal is to maintain an algorithmic solution that is competitive with respect to the current optimal solution. This model is popular in both the dynamic algorithms and online algorithms communities. The difference lies in the restriction placed on the algorithm: in dynamic algorithms, the running time of the algorithm making updates (called update time) is bounded, while in online algorithms, the number of updates made to the solution (called recourse) is limited. We give new results in both settings (all recourse and update time bounds are amortized):

• In the update time setting, we obtain O(log n)-competitiveness with O(f log n) update time, and O(f³)-competitiveness with O(f²) update time. The O(log n)-competitive algorithm is the first to achieve a competitive ratio independent of f in this setting. The second result improves on previous work by removing an O(log n) factor from the update time bound. This has an important consequence: we obtain the first deterministic constant-competitive, constant-update-time algorithm for fully dynamic vertex cover.

• In the recourse setting, we show a competitive ratio of O(min{log n, f}) with constant recourse. The most relevant previous result is the O(log m log n) bound for online set cover in the insertion-only model with no recourse. Note that we can match the best offline bounds with O(1) recourse, something that is impossible in the classical online model.

These results also yield, as corollaries, new results for the maximum k-coverage problem and the non-metric facility location problem in the fully dynamic model. Our results are based on two algorithmic frameworks in the fully dynamic model that are inspired by the classic greedy and primal-dual algorithms for offline set cover. We show that both frameworks can be used to obtain both recourse and update time bounds, thereby demonstrating algorithmic techniques common to these strands of research.
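Both frameworks above are inspired by the classic offline algorithms for set cover. For reference, a minimal sketch of the offline greedy (the ln n-approximation), on a toy instance of our own:

```python
def greedy_set_cover(universe, sets):
    """Classic offline greedy for set cover: repeatedly pick the set
    covering the most still-uncovered elements (ln n approximation).

    universe: iterable of elements; sets: dict name -> set of elements.
    Returns the list of chosen set names.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set with maximum coverage of uncovered elements
        best = max(sets, key=lambda s: len(sets[s] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("universe not coverable by the given sets")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# toy usage
U = range(1, 8)
S = {"a": {1, 2, 3, 4}, "b": {4, 5, 6}, "c": {6, 7}, "d": {1, 5, 7}}
print(greedy_set_cover(U, S))   # e.g. ['a', 'b', 'c']
```

The dynamic difficulty is that element departures can invalidate the greedy choices already made; the paper's frameworks control how much of this structure must be rebuilt, either in update time or in recourse.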
We reinterpret some online greedy algorithms for a class of nonlinear "load-balancing" problems as solving a mathematical program online. For example, we consider the problem of assigning jobs to (unrelated) machines to minimize the sum of the α-th powers of the loads plus assignment costs (the online Generalized Assignment Problem); or choosing paths to connect terminal pairs to minimize the sum of the α-th powers of the edge loads (i.e., online routing with speed-scalable routers). We give analyses of these online algorithms using the dual of the primal program as a lower bound for the optimal algorithm, much in the spirit of online primal-dual results for linear problems.

We then observe that a wide class of uni-processor speed scaling problems (with essentially arbitrary scheduling objectives) can be viewed as such load-balancing problems with linear assignment costs. This connection gives new algorithms for problems that had resisted solutions using the dominant potential function approaches in the speed scaling literature, as well as alternate, cleaner proofs of other known results.
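A minimal sketch of the greedy rule being reinterpreted, for the unrelated-machines objective above: each arriving job goes to the machine where the marginal increase in Σ load^α plus the assignment cost is smallest. The instance format is our own illustration:

```python
def greedy_assign(alpha, machines, jobs):
    """Online greedy for minimizing (sum of load^alpha) + assignment costs.

    machines: number of machines.
    jobs: list of (sizes, costs), where sizes[i] / costs[i] are the job's
          size / assignment cost on machine i (unrelated machines).
    Returns (assignment, objective value).
    """
    load = [0.0] * machines
    assignment, objective = [], 0.0
    for sizes, costs in jobs:
        def marginal(i):
            # increase in load^alpha on machine i, plus assignment cost
            return (load[i] + sizes[i]) ** alpha - load[i] ** alpha + costs[i]
        best = min(range(machines), key=marginal)
        objective += marginal(best)
        load[best] += sizes[best]
        assignment.append(best)
    return assignment, objective

# toy usage: 2 machines, alpha = 2 (energy-like convex objective)
jobs = [([1.0, 2.0], [0.5, 0.1]),
        ([1.0, 1.0], [0.0, 0.0]),
        ([3.0, 1.0], [0.2, 0.2])]
print(greedy_assign(2.0, 2, jobs))
```

The paper's contribution is the analysis: bounding this greedy against a dual solution of the underlying convex program rather than via a potential function.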