Many IP multicast applications, for example, near real-time dissemination of financial information, require high availability. This problem has not received much attention so far. In this paper [1], we investigate a complete multicast routing architecture consisting of IGMP [2] for multicast group membership management in a LAN, OSPF [3] for unicast routing, and PIM sparse-mode [4] and PIM dense-mode [5] for multicast routing. We restrict our attention to single link and router faults inside the network, and assume that the sending and receiving hosts and their LANs are reliable. Since fault recovery associated with rendezvous point (RP) failures in PIM SM has been studied extensively [4], this paper focuses on other mechanisms (router, link, LAN, and WAN fail-over) that are not sufficiently addressed and are less well understood by the community. Analytical models are presented to describe the interplay of all of the component protocols in various multicast channel recovery scenarios. Quantitative results for the recovery time of IP multicast channels are given as references for network configuration and protocol development. Simulation models are developed using the OPNET simulation tool to measure the fault recovery time and the associated protocol control overhead, and to study the influence of important protocol parameters. A testbed with five Cisco routers is configured with PIM, OSPF, and IGMP to measure the multicast channel failure and recovery time for a variety of link and router failures. In general, failure recovery is found to be lightweight in terms of control overhead and recovery time. Failure recovery time in a WAN is found to be dominated by the unicast protocol recovery process. Failure recovery in a LAN is more complex, and is strongly influenced by protocol interactions and implementation specifics.
NETWORK FAILURE SCENARIOS

When a network element fails, IGMP, OSPF, and PIM interact asynchronously to recover a multicast channel. The analysis of PIM SM is restricted to shared trees (not shortest path trees) and thus does not address failures during the migration from a shared tree to a shortest path tree. PIM SM and DM recover from network element failures in a similar manner. However, to recover the part of the multicast channel upstream of a router, a router running PIM SM will send a Join message to its Reverse Path Forwarding (RPF) router, while a router running PIM DM will send a Graft message. From here on, "PIM" is used to refer to both the DM and SM cases, unless otherwise specified. Single-fault network failures can be classified i...