Diffusion approximations for load balancing mechanisms in cloud storage systems

Budhiraja, Amarjit; Friedlander, Eric

doi:10.1017/apr.2019.3

Cited by 9 publications

(6 citation statements)

References 34 publications

(80 reference statements)


“…It is noteworthy that the scaled occupancy process loses its diffusive behavior for fixed d. It is further shown in [11] that with high probability the steady-state fraction of queues with length at least log_d(N/η(N)) − ω(1) tasks approaches unity, which in turn implies that with high probability the steady-state delay is at least log_d(N/η(N)) − O(1) as N → ∞. The diffusion approximation of the JSQ(d) policy in the Halfin-Whitt regime (2.1), starting from a different initial scaling, has been studied by Budhiraja & Friedlander [8]. Recently, Ying [47] introduced a broad framework involving Stein's method to analyze the rate of convergence of the scaled steady-state occupancy process of the JSQ(2) policy when η(N) = N^α with α > 0.8.…”

Section: not specified (mentioning)

confidence: 99%

Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods

Boor, Borst, Leeuwaarden, et al. 2019

Proceedings of the International Congress of Mathematicians (ICM 2018)


We present an overview of scalable load balancing algorithms which provide favorable delay performance in large-scale systems, and yet only require minimal implementation overhead. Aimed at a broad audience, the paper starts with an introduction to the basic load balancing scenario, referred to as the supermarket model, consisting of a single dispatcher where tasks arrive that must immediately be forwarded to one of N single-server queues. The supermarket model is a dynamic counterpart of the classical balls-and-bins setup where balls must be sequentially distributed across bins.

A popular class of load balancing algorithms are so-called power-of-d or JSQ(d) policies, where an incoming task is assigned to a server with the shortest queue among d servers selected uniformly at random. As the name reflects, this class includes the celebrated Join-the-Shortest-Queue (JSQ) policy as a special case (d = N), which has strong stochastic optimality properties and yields a mean waiting time that vanishes as N grows large for any fixed subcritical load. However, a nominal implementation of the JSQ policy involves a prohibitive communication burden in large-scale deployments. In contrast, a simple random assignment policy (d = 1) does not entail any communication overhead, but the mean waiting time remains constant as N grows large for any fixed positive load.

In order to examine the fundamental trade-off between delay performance and implementation overhead, we consider an asymptotic regime where the diversity parameter d(N) depends on N. We investigate what growth rate of d(N) is required to match the optimal performance of the JSQ policy on fluid and diffusion scale, and achieve a vanishing waiting time in the limit. The results demonstrate that the asymptotics for the JSQ(d(N)) policy are insensitive to the exact growth rate of d(N), as long as the latter is sufficiently fast, implying that the optimality of the JSQ policy can asymptotically be preserved while dramatically reducing the communication overhead.

Stochastic coupling techniques play an instrumental role in establishing the asymptotic optimality and universality properties, and augmentations of the coupling constructions allow these properties to be extended to infinite-server settings and network scenarios. We additionally show how the communication overhead can be reduced yet further by the so-called Join-the-Idle-Queue (JIQ) scheme, leveraging memory at the dispatcher to keep track of idle servers.

In the present paper we review scalable load balancing algorithms (LBAs) which achieve excellent delay performance in large-scale systems and yet only involve low implementation overhead. LBAs play a critical role in distributing service requests or tasks (e.g. compute jobs, database look-ups, file transfers) among servers or distributed resources in parallel-processing systems. The analysis and design of LBAs has attracted strong attention in recent years, mainly spurred by crucial scalability challenges arising in cloud networks and data centers with massive...
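The JSQ(d) dispatch rule described in the abstract above can be illustrated with a small simulation. The following is a minimal sketch, not code from any of the cited papers: it assumes Poisson arrivals at total rate N·load, exponential service at rate 1 per server, and uses uniformization (every step is either an arrival or a potential service completion at a uniformly chosen server). The function name `simulate_jsq_d` and all parameter names are hypothetical.

```python
import random


def simulate_jsq_d(n_servers, d, load, n_events, seed=0):
    """Return the time-averaged number of jobs per server under JSQ(d).

    Assumptions (illustrative only): arrivals at total rate n_servers * load,
    unit-rate exponential service at every server, system starts empty.
    """
    if not 1 <= d <= n_servers:
        raise ValueError("d must satisfy 1 <= d <= n_servers")
    rng = random.Random(seed)
    queues = [0] * n_servers
    # Uniformized chain: total event rate is n_servers * (load + 1),
    # so each event is an arrival with probability load / (load + 1).
    p_arrival = load / (1.0 + load)
    jobs_in_system = 0
    area = 0  # running sum of jobs_in_system over uniformized steps
    for _ in range(n_events):
        if rng.random() < p_arrival:
            # Sample d distinct servers uniformly and join the shortest queue.
            candidates = rng.sample(range(n_servers), d)
            target = min(candidates, key=queues.__getitem__)
            queues[target] += 1
            jobs_in_system += 1
        else:
            # Potential service completion; a no-op if the server is idle.
            server = rng.randrange(n_servers)
            if queues[server] > 0:
                queues[server] -= 1
                jobs_in_system -= 1
        area += jobs_in_system
    return area / (n_events * n_servers)
```

Calling the function with d = 1, d = 2 and d = n_servers at the same load compares random assignment, power-of-two choices and full JSQ; the run is short and starts from an empty system, so the numbers are only indicative.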



“…This feature has implications on the choice of the Hilbert space L2(w) chosen to formulate the SDE (10) and, additionally, several estimates have to be derived to control the stochastic fluctuations of the first coordinate. This situation is different from the examples of the Ginzburg-Landau model in [7,28,34], since each site only have interactions with a finite number of sites, or in the stochastic network example [6] where the interaction range is also finite. It should be also noted that our evolution equations are driven by a set of independent Poisson processes whose intensity is state dependent which is not the case in the Ginzburg-Landau models for which the diffusion coefficients are constant.…”

Section: Introduction (mentioning)

confidence: 81%

“…For more results on related fluctuation problems in statistical mechanics, see Spohn [28,29]. Fluctuations of an infinite dimensional Markov process associated to load balancing mechanisms in large stochastic networks have been investigated in Graham [13], Budhiraja and Friedlander [6].…”

Section: Introduction (mentioning)

confidence: 99%

A Functional Central Limit Theorem for the Becker–Döring Model

Sun 2018

J Stat Phys


We investigate the fluctuations of the stochastic Becker-Döring model of polymerization when the initial size of the system converges to infinity. A functional central limit theorem is proved for the vector of the number of polymers of a given size. It is shown that the stochastic process associated with the fluctuations converges to the strong solution of an infinite dimensional stochastic differential equation (SDE) in a Hilbert space. We also prove that, at equilibrium, the solution of this SDE is a Gaussian process. The proofs are based on a specific representation of the evolution equations, the introduction of a convenient Hilbert space and several technical estimates to control the fluctuations, especially of the first coordinate which interacts with all components of the infinite dimensional vector representing the state of the process.


“…The influential works of Mitzenmacher [23,24] and Vvedenskaya et al. [29] showed by considering a fluid scaling that increasing d from 1 to 2 leads to significant improvement in performance in terms of steady-state queue length distributions in that the tails of the asymptotic steady-state distributions decay exponentially when d = 1 and super-exponentially when d = 2. Limit theorems under a diffusion scaling for the JSQ(d) system, with a fixed d, can be found in [5,7]. Although JSQ(d) for a fixed d ≥ 2 leads to significant improvements over JSQ(1), as observed in [10,11], no fixed value of d provides the optimal waiting time properties of the join-the-shortest-queue system (i.e.…”

Section: Introduction (mentioning)

confidence: 99%

Near Equilibrium Fluctuations for Supermarket Models with Growing Choices

Bhamidi, Budhiraja, Dewaskar 2020

Preprint

Self Cite


We consider the supermarket model in the usual Markovian setting where jobs arrive at rate nλ_n for some λ_n > 0, with n parallel servers each processing jobs in its queue at rate 1. An arriving job joins the shortest among d_n ≤ n randomly selected service queues. We show that when d_n → ∞ and λ_n → λ ∈ (0, ∞), under natural conditions on the initial queues, the state occupancy process converges in probability, in a suitable path space, to the unique solution of an infinite system of constrained ordinary differential equations parametrized by λ. Our main interest is in the study of fluctuations of the state process about its near equilibrium state in the critical regime, namely when λ_n → 1. Previous papers, e.g. [25], have considered the regime d_n/(√n log n) → ∞, while the objective of the current work is to develop diffusion approximations for the state occupancy process that allow for all possible rates of growth of d_n. In particular we consider the three canonical regimes (a) ... In all three regimes we show, by establishing suitable functional limit theorems, that (under conditions on λ_n) fluctuations of the state process about its near equilibrium are of order n^(−1/2) and are governed asymptotically by a one dimensional Brownian motion. The forms of the limit processes in the three regimes are quite different; in the first case we get a linear diffusion; in the second case we get a diffusion with an exponential drift; and in the third case we obtain a reflected diffusion in a half space. In the special case d_n/(√n log n) → ∞ our work gives alternative proofs for the universality results established in [25].
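The fixed-d statement quoted above (tails decaying exponentially for d = 1 and super-exponentially for d = 2) can be checked numerically. The sketch below assumes the standard fluid-limit fixed point from the Mitzenmacher and Vvedenskaya et al. line of work, in which the fraction s_k of queues with at least k jobs satisfies s_0 = 1 and s_k = λ · (s_{k−1})^d for load λ < 1; this recursion is not stated on this page, and the function name `fluid_tail` is hypothetical.

```python
def fluid_tail(load, d, k_max):
    """Return [s_0, ..., s_k_max], the fluid-limit fractions of queues
    with at least k jobs under JSQ(d), using s_k = load * s_{k-1} ** d.

    Assumes 0 < load < 1 and an integer d >= 1 (illustrative sketch).
    """
    if not 0.0 < load < 1.0:
        raise ValueError("load must lie strictly between 0 and 1")
    if d < 1:
        raise ValueError("d must be at least 1")
    tails = [1.0]
    for _ in range(k_max):
        tails.append(load * tails[-1] ** d)
    return tails
```

For d = 1 the recursion gives the geometric tail λ^k, while for d ≥ 2 it gives λ raised to the power (d^k − 1)/(d − 1), so the exponent itself grows geometrically in k.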

