Abstract: Many of today's signal processing tasks consider sparse models where the number of explanatory variables exceeds the sample size. When dealing with real-world data, the presence of impulsive noise and outliers must also be accounted for. Accurate and robust parameter estimation and consistent variable selection are needed simultaneously. Recently, some popular robust methods have been adapted to such complex settings. In high-dimensional settings especially, however, a single contaminated predictor can be responsible for many outliers. The number of outliers introduced by this predictor easily exceeds the breakdown point of any existing robust estimator. Therefore, we propose a new robust and sparse estimator, the Outlier-Corrected-Data-(Adaptive) Lasso (OCD-(A) Lasso). It simultaneously handles highly contaminated predictors in the dataset and performs well under the classical contamination model. In a numerical study, it outperforms competing Lasso estimators, at a substantially lower computational complexity than its robust counterparts.
Active Queue Management (AQM) aims to prevent bufferbloat and serial drops in router and switch FIFO packet buffers that usually employ drop-tail queueing. AQM describes methods to send proactive feedback to TCP flow sources to regulate their rate using selective packet drops or markings. Traditionally, AQM policies relied on heuristics to approximately provide Quality of Service (QoS) such as a target delay for a given flow. These heuristics are usually based on simple network and TCP control models together with the monitored buffer filling. A primary drawback of these heuristics is that it is not well understood how they account for flow characteristics in the feedback mechanism, or what effect this has on the state of congestion. In this work, we show that, given a probabilistic model for the flow rates and the dequeueing pattern, a Semi-Markov Decision Process (SMDP) can be formulated to obtain an optimal packet dropping policy. This policy-based AQM, denoted PAQMAN, takes into account a steady-state model of TCP and a target delay for the flows. Additionally, we present an inference algorithm that builds on TCP congestion control in order to calibrate the model parameters governing underlying network conditions. Finally, we evaluate the performance of our approach in simulation against state-of-the-art AQM algorithms.
We study job assignment in large, heterogeneous resource-sharing clusters of servers with finite buffers. This load balancing problem arises naturally in today's communication and big data systems, such as Amazon Web Services, Network Service Function Chains, and Stream Processing. Arriving jobs are dispatched to a server, following a load balancing policy that optimizes a performance criterion such as job completion time. Our contribution is a randomized Cost-Based Scheduling (CBS) policy in which the job assignment is driven by general cost functions of the server queue lengths. Beyond existing schemes, such as the Join the Shortest Queue (JSQ), the power of d or the SQ(d) and the capacity-weighted JSQ, the notion of CBS yields new application-specific policies such as hybrid locally uniform JSQ. As today's data center clusters have thousands of servers, exact analysis of CBS policies is tedious. In this work, we derive a scaling limit when the number of servers grows large, facilitating a comparison of various CBS policies with respect to their transient as well as steady state behavior. A byproduct of our derivations is the relationship between the queue filling proportions and the server buffer sizes, which cannot be obtained from infinite buffer models. Finally, we provide extensive numerical evaluations and discuss several applications including multi-stage systems.
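To make the idea of cost-driven dispatching concrete, the following is a minimal sketch, not the authors' CBS policy. It assumes that a dispatcher samples d servers uniformly at random and sends the job to the sampled server with the lowest cost that still has buffer space. The names `dispatch`, `queue_length_cost`, and `capacity_weighted_cost` are hypothetical, and the way the paper randomizes over costs may differ.

```python
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical cost signature: (server index, current queue length) -> cost.
CostFn = Callable[[int, int], float]


def queue_length_cost(server: int, queue_len: int) -> float:
    """Cost equal to the queue length; with d = n this behaves like JSQ,
    with d < n like SQ(d)."""
    return float(queue_len)


def capacity_weighted_cost(rates: Sequence[float]) -> CostFn:
    """Cost equal to queue length divided by an assumed service rate,
    in the spirit of capacity-weighted JSQ."""
    def cost(server: int, queue_len: int) -> float:
        return queue_len / rates[server]
    return cost


def dispatch(
    queues: List[int],
    buffers: Sequence[int],
    cost: CostFn,
    d: int,
    rng: random.Random,
) -> Optional[int]:
    """Sample d distinct servers and return the index of the cheapest one
    that is not full, or None if every sampled server is full (job blocked)."""
    candidates = rng.sample(range(len(queues)), d)
    available = [i for i in candidates if queues[i] < buffers[i]]
    if not available:
        return None
    return min(available, key=lambda i: cost(i, queues[i]))


if __name__ == "__main__":
    rng = random.Random(0)
    queues = [3, 1, 2]
    buffers = [5, 5, 5]
    target = dispatch(queues, buffers, queue_length_cost, d=3, rng=rng)
    if target is not None:
        queues[target] += 1  # the job joins the chosen queue
    print(target, queues)
```

Swapping the cost function changes the policy without touching the dispatcher, which is the point of phrasing the schemes above in terms of costs. The finite `buffers` argument reflects the finite-buffer setting of the abstract: a job can be blocked when all sampled servers are full.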
Recent advances in quality adaptation algorithms leave adaptive bitrate (ABR) streaming architectures at a crossroads: When determining the sustainable video quality one may either rely on the information gathered at the client vantage point or on server and network assistance. The fundamental problem here is to determine how valuable either information is for the adaptation decision. This problem becomes particularly hard in future Internet settings such as Named Data Networking (NDN) where the notion of a network connection does not exist. In this paper, we provide a fresh view on ABR quality adaptation for QoE maximization, which we formalize as a decision problem under uncertainty, and for which we contribute a sparse Bayesian contextual bandit algorithm denoted CBA. This allows taking high-dimensional streaming context information, including client-measured variables and network assistance, to find online the most valuable information for the quality adaptation. Since sparse Bayesian estimation is computationally expensive, we develop a fast new inference scheme to support online video adaptation. We perform an extensive evaluation of our adaptation algorithm in the particularly challenging setting of NDN, where we use an emulation testbed to demonstrate the efficacy of CBA compared to state-of-the-art algorithms.
We analyze a data-processing system with n clients producing jobs which are processed in batches by m parallel servers; the system throughput critically depends on the batch size and a corresponding sub-additive speedup function that arises due to overhead amortization. In practice, throughput optimization relies on numerical searches for the optimal batch size, which are computationally cumbersome. In this paper, we model this system in terms of a closed queueing network assuming certain forms of service speedup; a standard Markovian analysis yields the optimal throughput in ω(n^4) time. Our main contribution is a mean-field model that has a unique, globally attractive stationary point, derivable in closed form. This point characterizes the asymptotic throughput as a function of the batch size that can be calculated in O(1) time. Numerical settings from a large commercial system reveal that this asymptotic optimum is accurate in practical finite regimes.