Proceedings International Conference on Dependable Systems and Networks
DOI: 10.1109/dsn.2002.1029006

Reducing recovery time in a small recursively restartable system

Abstract: We present ideas on how to structure software systems for high availability by considering MTTR/MTTF

Cited by 49 publications (29 citation statements)
References 5 publications (11 reference statements)
“…Some approaches have focused on automatic restarting of components before or after they have failed [3], [4]. There have also been some works in applying techniques from Markov decision theory to dependability problems [5].…”
Section: Related Work (mentioning)
confidence: 99%
“…More recently, two techniques have been presented to improve the performance compared to whole-system rebooting. The first technique is microrebooting [13], [14], which is a fine-grained technique for surgically recovering faulty application components without disturbing the rest of the application. Microrebooting is evaluated in an Internet auction system running on an application server.…”
Section: Implementation Of Watchdog Timers and Past-run Trace Reco… (mentioning)
confidence: 99%
“…Restarts of individual components for recovery, pro-actively as well as reactively have been seen in software rejuvenation [25] and recovery-oriented computing [13,4]. While the latter used fast restarts of individual components in a system as a tool to improve system availability, our work assumes a recovery time and maintains timely operation in the presence of restarts thus increasing the MTTF of the system aiding both reliability and availability.…”
Section: Related Work (mentioning)
confidence: 99%