Recovery schemes for high availability and high performance distributed real-time computing

Lundberg, Lars; Häggander, Daniel; Klonowska, Kamilla; Svahnberg, Charlie

doi:10.1109/ipdps.2003.1213241

Cited by 4 publications

(3 citation statements)

References 3 publications


“…In this case the recovery schemes are implemented in the external systems as second, third, fourth,... alternative destinations (alternative cluster nodes) in case the primary, secondary,... destination goes down); • there is one network address for each computer and we use IP takeover (or similar techniques); • all the work performed by a computer is done by one process or a group of related processes that share local resources and thus must be moved as one unit. The results presented here can be easily generalized to the case with a number of independent processes on each computer using the same technique as in some of our previous papers [9,10].…”

Section: Problem Formulation (mentioning)

confidence: 88%


Extended Golomb Rulers as the New Recovery Schemes in Distributed Dependable Computing

Klonowska, Lundberg, Lennerstad, et al.

19th IEEE International Parallel and Distributed Processing Symposium

Self Cite


Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers break down, the load on these computers must be redistributed to other computers in the cluster. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e., we want to optimize the worst-case behavior. We have previously defined recovery schemes that are optimal for some limited cases. In this paper we find new recovery schemes that are based on so-called Golomb rulers. They are optimal for a much larger number of cases than the previous results.
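For readers unfamiliar with the term used in this abstract: a Golomb ruler is a set of integer marks in which every pair of marks is a different distance apart. The sketch below only checks that defining property; it does not reproduce how the cited paper turns a ruler into a recovery scheme, and the function name `is_golomb_ruler` is an illustrative choice, not something taken from the paper.

```python
from itertools import combinations


def is_golomb_ruler(marks):
    """Return True if all pairwise differences between marks are distinct.

    Illustrative helper (hypothetical name), not code from the cited paper.
    """
    ordered = sorted(set(marks))
    if len(ordered) != len(marks):
        # A repeated mark would give a zero-length difference.
        return False
    # Differences are positive because the marks are sorted ascending.
    diffs = [b - a for a, b in combinations(ordered, 2)]
    return len(diffs) == len(set(diffs))


print(is_golomb_ruler([0, 1, 4, 6]))  # True: differences 1, 2, 3, 4, 5, 6
print(is_golomb_ruler([0, 1, 2, 4]))  # False: 1 - 0 equals 2 - 1
```

The marks 0, 1, 4, 6 form a ruler because their six pairwise differences are all different, whereas 0, 1, 2, 4 repeats the difference 1.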


“…Another algorithm, called Greedy, is presented in [9]. This algorithm generates the recovery schemes that give optimality for a larger number of cases than the Log algorithm, (i.e.…”

Section: Previous Research (mentioning)

confidence: 99%

Extended Golomb Rulers as the New Recovery Schemes in Distributed Dependable Computing

Klonowska, Lundberg, Lennerstad, et al.

19th IEEE International Parallel and Distributed Processing Symposium

Self Cite


“…In our system, however, we do not have the (non-volatile) memory capacity and the computational resources for a checkpointing approach. In general purpose distributed computing it is common to use redundant hardware and employ load sharing techniques to increase fault-tolerance (see, e.g., [5]). …”

Section: Related Work (mentioning)

confidence: 99%

Improving fault-tolerance in intelligent video surveillance by monitoring, diagnosis and dynamic reconfiguration

Doblander, Maier, Rinner, et al.

Third International Workshop on Intelligent Solutions in Embedded Systems, 2005.


In this paper, we present an approach for improving fault-tolerance and service availability in intelligent video surveillance (IVS) systems. A typical IVS system consists of various intelligent video sensors that combine image sensing with video analysis and network streaming. System monitoring and fault diagnosis followed by appropriate dynamic system reconfiguration mitigate the effects of faults and therefore enhance the system's fault-tolerance. The applied monitoring and diagnosis unit (MDU) allows the detection of both node- and system-level faults. Lacking redundant hardware, such reconfigurations are established by graceful degradation of the overall application. An optimizer module that performs multi-criterion optimization is used to compute a new degraded system configuration by trading off quality of service (QoS), energy consumption, and service availability. We demonstrate the functionality of our approach by an illustrative example.


Using Optimal Golomb Rulers for Minimizing Collisions in Closed Hashing

Lundberg, Lennerstad, Klonowska, et al. 2004

Advances in Computer Science - ASIAN 2004. Higher-Level Decision Making
