A 2D torus network is one of the most popular networks for parallel processing. We have studied North-South First (NSF) routing, which is applicable to a 2D torus and combines the north-first (NF) and south-first (SF) methods. That work focused on a routing algorithm aimed at avoiding congestion in a crowded network; it was superior in congestion tolerance but not in fault tolerance. We have therefore been studying algorithms that add fault tolerance to the NSF method. In this paper we propose the NSF-FT method, a new routing algorithm with improved fault tolerance. We evaluated the congestion resistance and fault tolerance of the proposed method by simulating its dynamic communication performance. The software simulation showed that the proposed algorithm has higher performance.

A number of adaptive routing algorithms based on the turn model [14-17] do not need additional virtual channels. However, most of these algorithms cannot be applied to torus networks without change. If an adaptive routing algorithm for a torus network could be realized by modifying the turn model, adaptive routing would be possible without installing additional virtual channels [18].

We have previously proposed the North-South First (NSF) algorithm, which combines the North First and South First algorithms [19-23]. Because our work up to now focused on a routing algorithm aimed at avoiding congestion of the interconnection network, the fault tolerance of the NSF algorithm was not sufficient.

We then improved the NSF algorithm and proposed an improved North-South First method (NSF-IP, NSF-ImProved), which is a fault-tolerant routing algorithm, and evaluated both its congestion resistance and its fault tolerance in dynamic communication performance by simulation [23]. However, the fault tolerance of NSF-IP is still not sufficient [23].
It is thought that the fault tolerance can be improved by changing the routing policy when a packet arrives at a faulty PE.

In this paper we propose the NSF-FT (NSF-Fault-Tolerant) algorithm, which improves fault tolerance by refining the NSF-IP algorithm. We evaluate its congestion resistance and fault tolerance by software simulation.
2D Torus Network

The structure of a 2D torus network is shown in Figure 1. The network has an N×N two-dimensional structure, and its four edges are connected by wraparound links. It is used in many parallel computers and some interconnection networks.

Figure 1: Structure of a 2D-torus network.

Dimension-order routing (DOR) is generally used for deterministic routing on a 2D torus. In DOR, the packet moves on channels in the y direction before moving in the x direction. To avoid deadlocks on a 2D torus, DOR needs two virtual channels (channel-L and channel-H). Choose channel-L when starting routing in the y direction. When the head of the packet passes through a wraparound link, move the packet to channel-H. When the routing in the y direction is completed, move the packet in the x direction; use channel-L r...
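
As an illustration of the y-then-x order and of the channel switch at a wraparound link, the following is a minimal Python sketch, not the implementation used in our simulator. The function names (`ring_step`, `dor_route`), the (x, y) coordinate convention, the choice of the shorter way around each ring, and the tie-break toward the increasing direction are all assumptions made for this example. The channel is tracked only for the y phase, and the recorded value is taken to be the channel the packet occupies after each hop; the x-phase channel is left as `None` because its rule is not restated here.

```python
def ring_step(src, dst, n):
    """Return (step, hops) for the shorter way around a ring of n nodes.

    step is +1 or -1; ties go to +1 (an assumption of this sketch).
    """
    forward = (dst - src) % n
    backward = (src - dst) % n
    if forward <= backward:
        return 1, forward
    return -1, backward


def dor_route(src, dst, n):
    """Sketch of dimension-order routing on an n x n torus (y first, then x).

    Returns a list of (node, dimension, channel) tuples, one per hop.
    For y hops, channel is "L" until a wraparound link has been crossed
    and "H" afterwards. For x hops, channel is None (not modelled here).
    """
    x, y = src
    path = []

    # Phase 1: route in the y direction, starting on channel-L.
    step, hops = ring_step(y, dst[1], n)
    channel = "L"
    for _ in range(hops):
        raw = y + step
        if raw < 0 or raw >= n:
            # The hop uses a wraparound link: switch to channel-H.
            channel = "H"
        y = raw % n
        path.append(((x, y), "y", channel))

    # Phase 2: route in the x direction (channel handling omitted).
    step, hops = ring_step(x, dst[0], n)
    for _ in range(hops):
        x = (x + step) % n
        path.append(((x, y), "x", None))

    return path


if __name__ == "__main__":
    # On a 4x4 torus, going from (0, 3) to (2, 0) crosses the y wraparound link.
    for hop in dor_route((0, 3), (2, 0), 4):
        print(hop)
```

In this sketch every y hop precedes every x hop, and a packet that never crosses a wraparound link in the y direction stays on channel-L for the whole y phase.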