We consider load balancing in service systems with affinity relations between jobs and servers. Specifically, an arriving job can be allocated either to a fast, primary server from a particular selection associated with this job, or to a secondary server where it is processed at a slower rate. Such job-server affinity relations can model network topologies based on geographical proximity, or data locality in cloud scenarios. We introduce load balancing schemes that allocate jobs to primary servers if available, and otherwise to secondary servers. A novel coupling construction is developed to obtain stability conditions and performance bounds. We also conduct a fluid limit analysis for symmetric model instances, which reveals a delicate interplay between the model parameters and load balancing performance.
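The allocation rule described above (primary server if available, otherwise a secondary server) can be illustrated with a minimal sketch. The abstract does not specify the details, so several points here are assumptions: "available" is taken to mean idle, ties are broken uniformly at random, and a job that finds no idle server is simply reported as blocked. The name `allocate` and its parameters are hypothetical.

```python
import random


def allocate(primaries, busy, rng):
    """Pick a server for an arriving job.

    primaries: set of indices of the job's fast (primary) servers.
    busy: list of booleans; busy[i] is True when server i is occupied.
    rng: a random.Random instance used for tie-breaking.

    Assumptions (not from the source): "available" means idle, ties are
    broken uniformly at random, and a job finding no idle server is blocked.
    """
    # Prefer any idle primary server.
    idle_primary = [i for i in sorted(primaries) if not busy[i]]
    if idle_primary:
        return rng.choice(idle_primary), "primary"

    # Otherwise fall back to an idle secondary (slower) server.
    idle_secondary = [
        i for i in range(len(busy)) if not busy[i] and i not in primaries
    ]
    if idle_secondary:
        return rng.choice(idle_secondary), "secondary"

    # No idle server at all: this sketch blocks the job.
    return None, "blocked"


if __name__ == "__main__":
    rng = random.Random(0)
    busy = [True, False, True, False]
    print(allocate({0, 1}, busy, rng))  # server 1 is the only idle primary
    print(allocate({0, 2}, busy, rng))  # both primaries busy, uses a secondary
```

In this sketch the slower processing rate of secondary servers is not modeled; only the choice of server is shown.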
Modern service systems, like cloud computing platforms or data center environments, commonly face a high degree of heterogeneity. This heterogeneity is caused not only by different server speeds but also by binding task-server relations that must be taken into account when assigning incoming tasks. Unfortunately, there are hardly any theoretical performance guarantees, as these systems do not fall within the typical supermarket modeling framework, which relies heavily on strong symmetry and homogeneity assumptions. In “Heavy-traffic universality of redundancy systems with assignment constraints,” Cardinaels, Borst, and van Leeuwaarden provide insight into the performance of these systems operating under redundancy scheduling policies. Surprisingly, when experiencing high demand, these systems exhibit state space collapse and can achieve a similar level of resource pooling and performance as a fully flexible system, even when subject to quite strict task-server constraints.
In classical power-of-two load balancing, any server pair is sampled with equal probability. This does not cover practical settings with assignment constraints, which force non-uniform server sampling. While intuition suggests that non-uniform sampling adversely impacts performance, this was only supported through simulations, and rigorous statements have remained elusive. Building on product-form distributions for redundancy systems, we prove the stochastic dominance of uniform sampling for a four-server system as well as for arbitrary-size systems in light traffic.
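The difference between uniform and non-uniform pair sampling can be made concrete with a short sketch. It assumes a join-the-shorter-queue variant of power-of-two, where a sampled pair of servers is compared and the job joins the shorter queue; the redundancy policies analyzed in the abstract are not modeled here. The names `uniform_pairs`, `sample_pair`, and `assign`, and the example weights, are hypothetical.

```python
import itertools
import random


def uniform_pairs(n):
    """Give every unordered server pair the same sampling weight."""
    return {pair: 1.0 for pair in itertools.combinations(range(n), 2)}


def sample_pair(pair_weights, rng):
    """Draw one server pair with probability proportional to its weight."""
    pairs = list(pair_weights)
    weights = [pair_weights[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def assign(queue_lengths, pair_weights, rng):
    """Sample a pair and send the job to the shorter of the two queues.

    Assumption: ties go to the first server of the sampled pair.
    """
    i, j = sample_pair(pair_weights, rng)
    target = i if queue_lengths[i] <= queue_lengths[j] else j
    queue_lengths[target] += 1
    return target


if __name__ == "__main__":
    rng = random.Random(1)
    queues = [0, 0, 0, 0]
    # Non-uniform sampling: assignment constraints favor some pairs
    # (illustrative weights only).
    constrained = {(0, 1): 3.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0}
    for _ in range(1000):
        assign(queues, constrained, rng)
    print(queues)
```

Replacing `constrained` with `uniform_pairs(4)` recovers the classical setting in which all six pairs of a four-server system are equally likely.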