Preventive maintenance of operational software systems, a novel technique for software fault tolerance, is used specifically to counteract the phenomenon of software "aging." However, it incurs some overhead. The necessity to do preventive maintenance, not only in general purpose software systems of mass use, but also in safety-critical and highly available systems, clearly indicates the need to follow an analysis based approach to determine the optimal times to perform preventive maintenance.In this paper, we present an analytical model of a software system which serves transactions. Due to aging, not only the service rate of the software decreases with time, but also the software itself experiences crash/hang failures which result in its unavailability. Two policies for preventive maintenance are modeled and expressions for resulting steady state availability, probability that an arriving transaction is lost and an upper bound on the expected response time of a transition are derived. Numerical examples are presented to illustrate the applicability of the models.
The representation of general distributions or measured data by phase-type distributions is an important and non-trivial task in analytical modeling. Although a large number of different methods for fitting parameters of phase-type distributions to data traces exist, many approaches lack efficiency and numerical stability. In this paper, a novel approach is presented that fits a restricted class of phase-type distributions, namely mixtures of Erlang distributions, to trace data.For the parameter fitting an algorithm of the expectation maximization type is developed. The paper shows that these choices result in a very efficient and numerically stable approach which yields phase-type approximations for a wide range of data traces that are as good or better than approximations computed with other less efficient and less stable fitting methods. To illustrate the effectiveness of the proposed fitting algorithm, we present comparative results for our approach and two other methods using six benchmark traces and two real traffic traces as well as quantitative results from queueing analysis.
Keywords:Performance and dependability assessment/analytical and numerical techniques, design of tools for performance/dependability assessment, traffic modeling, hyper-Erlang distributions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.