We report on our efforts to formulate autonomic network repair as a reinforcement-learning problem. Our implemented system is able to learn to efficiently restore network connectivity after a failure.

Our research explores a reinforcement-learning (Sutton & Barto 1998) formulation we call cost-sensitive fault remediation (CSFR), which was motivated by problems that arise in sequential decision making for diagnosis and repair. We have considered problems of web-server maintenance and disk-system replacement, and have fully implemented an experimental network-repair application.

In cost-sensitive fault remediation, a decision maker is responsible for repairing a system when it breaks down. To narrow down the source of the fault, the decision maker can perform a test action at some cost, and to repair the fault it can carry out a repair action. A repair action incurs a cost and either restores the system to proper functioning or fails. In either case, the system informs the decision maker of the outcome. The decision maker seeks a minimum-cost policy for restoring the system to proper functioning.

We can find an optimal repair policy via dynamic programming. Let S be the set of fault states and let B be the power set of S, which is the set of belief states of the system. For each b ∈ B, define the expected value of action a in belief state b as the expected cost of the action plus the value of the resulting belief state:

$$Q(b, a) = c(b, a) + \sum_{i \in \{0, 1\}} \frac{\Pr(b_i)}{\Pr(b)} \, V(b_i)$$

Here, b_i is the belief state resulting from taking action a in belief state b and obtaining outcome i ∈ {0, 1}; it is the subset of b consistent with this outcome. If a is a repair action and i = 1, we define the future value V(b_1) = 0, as there is no additional cost incurred once a repair action is successful.
In all other cases, the value of a belief state is the minimum action value taken over all available choices:

$$V(b) = \min_{a} Q(b, a)$$

where Q(b, a) denotes the expected value of action a in belief state b. The quantities Pr(b) and c(b, a) are the prior probability of a belief state and the expected cost of an action, which can be computed easily from a CSFR specification of the problem.

Table 1 illustrates a small CSFR example with two fault states, A and B. The planning process for this example begins with the belief state {A, B}. It considers the test actions DefaultGateway and PingIP and the repair actions FixIP, UseCachedIP, and RenewLease. It does not consider DnsLookup, since that action neither provides information (its outcome is always 0) nor has a non-zero chance of repair. In evaluating the action PingIP, the algorithm finds that outcome 0 has a probability of 0.25 and outcome 1 has a probability of 0.75. Its expected cost from belief state {A, B} is then

$$0.25\,(50 + \mathrm{cost}(\{A\})) + 0.75\,(250 + \mathrm{cost}(\{B\})) \tag{1}$$

The expected cost from belief state {A} is computed recursively. Since all test actions have an outcome with an estimated probability of 0, only repair actions are considered. Of these, RenewLease is chosen as the optimal action.
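
The recursion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implemented system. Table 1 is not reproduced here, so the instance below is hypothetical: only the PingIP outcome probabilities (0.25 and 0.75) and costs (50 and 250) follow the text, while the remaining costs, the outcomes assigned to each repair action, and the names `PRIOR`, `ACTIONS`, `REPAIRS`, `prob`, `action_value`, and `value` are made up for the example. Each action is assumed to have a deterministic outcome and cost for each fault state.

```python
from functools import lru_cache

# Hypothetical CSFR instance. Only the PingIP numbers follow the text;
# every other cost and outcome below is invented for illustration.
PRIOR = {"A": 0.25, "B": 0.75}            # prior probability of each fault state
REPAIRS = frozenset({"RenewLease", "FixIP"})
# ACTIONS[action][fault] = (outcome, cost); for a repair, outcome 1 = fixed.
ACTIONS = {
    "PingIP":     {"A": (0, 50),  "B": (1, 250)},
    "DnsLookup":  {"A": (0, 100), "B": (0, 100)},  # always 0: uninformative
    "RenewLease": {"A": (1, 400), "B": (0, 400)},
    "FixIP":      {"A": (0, 300), "B": (1, 300)},
}


def prob(b):
    """Prior probability Pr(b) of a belief state (a frozenset of faults)."""
    return sum(PRIOR[s] for s in b)


def action_value(b, a):
    """Return Q(b, a), or None when action a is not considered in b."""
    table = ACTIONS[a]
    # split[i] is b_i: the subset of b consistent with outcome i.
    split = [frozenset(s for s in b if table[s][0] == i) for i in (0, 1)]
    if split[0] == b or (a not in REPAIRS and split[1] == b):
        return None  # no information and no chance of repair
    q = sum(PRIOR[s] * table[s][1] for s in b) / prob(b)  # c(b, a)
    for i, b_i in enumerate(split):
        # Skip empty outcomes; V(b_1) = 0 after a successful repair.
        if b_i and not (a in REPAIRS and i == 1):
            q += prob(b_i) / prob(b) * value(b_i)[0]
    return q


@lru_cache(maxsize=None)
def value(b):
    """Return (V(b), best action): the minimum Q(b, a) over considered actions."""
    best_q, best_a = float("inf"), None
    for a in ACTIONS:
        q = action_value(b, a)
        if q is not None and q < best_q:
            best_q, best_a = q, a
    return best_q, best_a


print(value(frozenset("AB")))
```

With these made-up numbers, `action_value(frozenset("AB"), "PingIP")` evaluates expression (1) with cost({A}) = 400 and cost({B}) = 300. Every recursive call is on a strict subset of the current belief state, so the recursion terminates, and memoizing `value` keeps the work bounded by the number of belief states reached.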