Proceedings 20th IEEE International Parallel & Distributed Processing Symposium 2006
DOI: 10.1109/ipdps.2006.1639293

Load balancing in the presence of random node failure and recovery

Abstract: In many distributed computing systems that are prone to either induced or spontaneous node failures, the number of available computing resources is dynamically changing in a random fashion. A load-balancing (LB) policy for such systems should therefore be robust, in terms of workload re-allocation and effectiveness in task completion, with respect to the random absence and re-emergence of nodes as well as random delays in the transfer of workloads among nodes. In this paper two LB policies for such computing e…

Cited by 9 publications (22 citation statements)
References 11 publications
“…However, the available literature on distributed computing in such uncertain environments primarily considers reactive techniques, where a node failure is addressed only after its occurrence [7]. One of the few exceptions is the paper of Dhakal et al [12] that presents two preemptive load-balancing policies for a heterogeneous distributed computing system with wireless links between nodes. Preemptiveness in this case implies adjusting actions to compensate for the possibility of node failure/recovery.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Regarding the transfer times, our assumptions are justified according to our prior work [9], [10], [16] and the empirical data obtained from the experiments conducted over the DC architecture to be discussed in Section 3. In addition, we have assumed that the mean transfer time of the ith group of tasks being transferred to the kth node follows the first-order approximation:…”
Section: Assumption A2 (Independence of the Random Times)
Citation type: mentioning
confidence: 99%
“…In general, however, the above partitions p_jk may not be effective and must be adjusted in order to compensate for the effects of the random transfer times. The load to be migrated from the jth to the kth must be adjusted according to what is called the load-balancing gain [9], [10], [16], [22], which is denoted as K_jk, yielding…”
Section: Distributed Load-balancing Policy
Citation type: mentioning
confidence: 99%