In a Fork-Join (FJ) queueing system an upstream fork station splits incoming jobs into N tasks to be further processed by N parallel servers, each with its own queue; the response time of one job is determined, at a downstream join station, by the maximum of the corresponding tasks' response times. This queueing system is useful to the modelling of multiservice systems subject to synchronization constraints, such as MapReduce clusters or multipath routing. Despite their apparent simplicity, FJ systems are hard to analyze.This paper provides the first computable stochastic bounds on the waiting and response time distributions in FJ systems. We consider four practical scenarios by combining 1a) renewal and 1b) non-renewal arrivals, and 2a) non-blocking and 2b) blocking servers. In the case of non-blocking servers we prove that delays scale as O(log N ), a law which is known for first moments under renewal input only. In the case of blocking servers, we prove that the same factor of log N dictates the stability region of the system. Simulation results indicate that our bounds are tight, especially at high utilizations, in all four scenarios. A remarkable insight gained from our results is that, at moderate to high utilizations, multipath routing "makes sense" from a queueing perspective for two paths only, i.e., response times drop the most when N = 2; the technical explanation is that the resequencing (delay) price starts to quickly dominate the tempting gain due to multipath transmissions.
A simple bound in GI/G/1 queues was obtained by Kingman using a discrete martingale transform [5]. We extend this technique to 1) multiclass (GI/G/1 queues and 2) Markov Additive Processes (MAPs) whose background processes can be time-inhomogeneous or have an uncountable state-space. Both extensions are facilitated by a necessary and sufficient ordinary differential equation (ODE) condition for MAPs to admit continuous martingale transforms. Simulations show that the bounds on waiting time distributions are almost exact in heavy-traffic, including the cases of 1) heterogeneous input, e.g., mixing Weibull and Erlang-k classes and 2) Generalized Markovian Arrival Processes, a new class extending the Batch Markovian Arrival Processes to continuous batch sizes.
We derive simple bounds on the queue distribution in finite-buffer queues with Markovian arrivals. Our technique relies on a subtle equivalence between tail events and stopping times orderings. The bounds capture a truncated exponential behavior, involving joint horizontal and vertical shifts of an exponential function; this is fundamentally different than existing results capturing horizontal shifts only. Using the same technique, we obtain similar bounds on the loss distribution, which is a key metric to understand the impact of finite-buffer queues on real-time applications. Simulations show that the bounds are accurate in heavy-traffic regimes, and improve existing ones by orders of magnitude. In the limiting regime with utilization ρ=1 and iid arrivals, the bounds on the queue size distribution are insensitive to the arrivals distribution.
The practicality of the stochastic network calculus (SNC) is often questioned on grounds of potential looseness of its performance bounds. In this paper it is uncovered that for bursty arrival processes (specifically Markov-Modulated On-Off (MMOO)), whose amenability to per-flow analysis is typically proclaimed as a highlight of SNC, the bounds can unfortunately indeed be very loose (e.g., by several orders of magnitude off). In response to this uncovered weakness of SNC, the (Standard) per-flow bounds are herein improved by deriving a general sample-path bound, using martingale based techniques, which accommodates FIFO, SP, and EDF scheduling disciplines. The obtained (Martingale) bounds capture an additional exponential decay factor of O e −αn in the number of flows n, and are remarkably accurate even in multiplexing scenarios with few flows.
Task replication has recently been advocated as a practical solution to reduce latencies in parallel systems. In addition to several convincing empirical studies, analytical results have been provided, yet under some strong assumptions such as independent service times of the replicas, which may lend themselves to some contrasting and perhaps contriving behavior. For instance, under the independence assumption, an overloaded system can be stabilized by a replication factor, but can be sent back in overload through further replication. Motivated by the need to dispense with such common and restricting assumptions, which may cause unexpected behavior, we develop a unified and general theoretical framework to compute tight bounds on the distribution of response times in general replication systems. These results immediately lend themselves to the optimal number of replicas minimizing response time quantiles, depending on the parameters of the system (e.g., the degree of correlation amongst replicas).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.