Over the past several decades, fault tolerance has become a major research field, driven by transistor shrinking and the growing number of cores in Systems-on-Chip (SoCs). In particular, faults occurring in the Networks-on-Chip (NoCs) of those systems have a significant impact, since the NoC is the key component of on-chip communication. Several fault-tolerant approaches have been proposed, but they remain limited against multiple permanent faults. To reduce the impact of these faults on data communications, we propose a bit-shuffling method for fault-tolerant NoCs. The proposed approach exploits, at runtime, the position of the permanent faults and changes the order of the bits inside a flit. Our bit-shuffling method reduces the fault impact as much as possible by transferring the faults from the Most Significant Bits (MSBs) towards the Least Significant Bits (LSBs). With this technique, we show that, in the presence of multiple permanent faults, the Mean Square Error (MSE) on the payload transmission is reduced from 10^17 to 10^5 under three permanent faults for 32-bit unsigned integers. This technique also ensures the correct transmission of headers under multiple permanent faults.
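The idea of moving permanent faults from the MSBs towards the LSBs can be illustrated with a small software model. The sketch below is only an assumption-laden illustration, not the hardware scheme of the paper: it models permanent faults as stuck-at wires, assumes the faulty wire positions are already known at both ends of the link, and uses a simple permutation (the names `build_permutation`, `shuffle`, `unshuffle`, `apply_stuck_at`, and `transmit` are hypothetical) that routes the lowest-order data bits over the faulty wires.

```python
WIDTH = 32  # assumed flit payload width, matching the 32-bit integers in the abstract


def build_permutation(faulty_wires, width=WIDTH):
    """Return perm such that data bit i travels on wire perm[i].

    Assumption: the lowest-order data bits are assigned to the faulty wires,
    and the remaining data bits use the healthy wires in ascending order.
    """
    faulty = sorted(set(faulty_wires))
    healthy = [w for w in range(width) if w not in faulty]
    return faulty + healthy


def shuffle(word, perm):
    """Sender side: place each data bit on its assigned wire."""
    flit = 0
    for data_bit, wire in enumerate(perm):
        flit |= ((word >> data_bit) & 1) << wire
    return flit


def unshuffle(flit, perm):
    """Receiver side: restore the original bit order."""
    word = 0
    for data_bit, wire in enumerate(perm):
        word |= ((flit >> wire) & 1) << data_bit
    return word


def apply_stuck_at(flit, faults):
    """Model permanent faults as wires stuck at 0 or 1 (faults: wire -> value)."""
    for wire, value in faults.items():
        if value:
            flit |= 1 << wire
        else:
            flit &= ~(1 << wire)
    return flit


def transmit(word, faults, use_shuffling):
    """Send one word over the faulty link, with or without bit-shuffling."""
    perm = build_permutation(faults if use_shuffling else [])
    return unshuffle(apply_stuck_at(shuffle(word, perm), faults), perm)


if __name__ == "__main__":
    word = 0x12345678
    faults = {31: 1, 30: 0, 5: 1}  # three hypothetical stuck-at faults
    for mode in (False, True):
        received = transmit(word, faults, mode)
        print(f"shuffling={mode}: received={received:#010x}, "
              f"squared error={(received - word) ** 2}")
```

In this model, three faults can corrupt at most the three lowest-order bits when shuffling is enabled, so the error on an unsigned integer stays below 8, whereas a fault on wire 31 without shuffling can shift the value by 2^31.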
Due to transistor shrinking and the growing number of cores in Systems-on-Chip (SoCs), fault tolerance has become essential. Indeed, faults occurring in the Networks-on-Chip (NoCs) of those systems have a significant impact, because of the large amount of data crossing the NoC for communication. However, existing fault correction approaches cannot efficiently address multiple permanent faults in the NoC, due to their high hardware costs. To mitigate the impact of faults, existing works shuffle the bits inside a flit, transferring the impact of faults onto the least significant bits. However, such approaches are applied at a fine-grained level, which provides efficient fault mitigation but incurs significant hardware costs. To address this limitation, this work proposes a region-based bit-shuffling technique, applied at a coarse-grained level, that trades off fault mitigation efficiency for lower hardware costs. The results show that the area and power overheads can be reduced from 48% to 33% and from 34% to 22%, respectively, with a small impact on the Mean Square Error (MSE).
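The trade-off between fine-grained and region-based shuffling can be pictured with a small software model. The abstract does not detail the mechanism, so everything in the sketch below is an assumption made for illustration: the flit is split into fixed-size regions (8 bits here), permanent faults are modeled as stuck-at wires, and whole regions are reordered so that the least significant data regions travel over the regions containing faults. The function names (`region_order`, `move_regions`, `apply_stuck_at`, `transmit_regions`) are hypothetical.

```python
def region_order(faulty_wires, width=32, region_bits=8):
    """Return order such that data region i travels on wire region order[i].

    Assumption: regions containing at least one faulty wire carry the
    least significant data regions; healthy regions keep ascending order.
    """
    n_regions = width // region_bits
    faulty = sorted({w // region_bits for w in faulty_wires})
    healthy = [r for r in range(n_regions) if r not in faulty]
    return faulty + healthy


def move_regions(value, order, region_bits, inverse=False):
    """Apply the region permutation (sender) or its inverse (receiver)."""
    mask = (1 << region_bits) - 1
    out = 0
    for data_region, wire_region in enumerate(order):
        src, dst = (wire_region, data_region) if inverse else (data_region, wire_region)
        out |= ((value >> (src * region_bits)) & mask) << (dst * region_bits)
    return out


def apply_stuck_at(flit, faults):
    """Model permanent faults as wires stuck at 0 or 1 (faults: wire -> value)."""
    for wire, stuck in faults.items():
        flit = flit | (1 << wire) if stuck else flit & ~(1 << wire)
    return flit


def transmit_regions(word, faults, use_shuffling, width=32, region_bits=8):
    """Send one word over the faulty link with optional region-based shuffling."""
    order = region_order(faults if use_shuffling else [], width, region_bits)
    flit = apply_stuck_at(move_regions(word, order, region_bits), faults)
    return move_regions(flit, order, region_bits, inverse=True)


if __name__ == "__main__":
    word = 0x12345678
    faults = {28: 0}  # one hypothetical stuck-at-0 fault in the top region
    for mode in (False, True):
        received = transmit_regions(word, faults, mode)
        print(f"shuffling={mode}: received={received:#010x}, "
              f"error={abs(received - word)}")
```

In this model, a fault in a single region can still corrupt any bit of the lowest data region, so the error is bounded by 2^8 rather than by 1 as with a per-bit permutation. That loss of precision is the price of needing only a handful of region-level routing choices instead of a per-bit one, which is the kind of efficiency-versus-cost trade-off the abstract describes.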