Proceedings of the 48th Design Automation Conference 2011
DOI: 10.1145/2024724.2024931

Enabling system-level modeling of variation-induced faults in networks-on-chips

Abstract: Process Variation (PV) is increasingly threatening the reliability of Networks-on-Chips. Thus, various resilient router designs have been recently proposed and evaluated. However, these evaluations assume random fault distributions, which result in 52%-81% inaccuracy. We propose an accurate circuit-level fault-modeling tool, which can be plugged into any system-level NoC simulator, quantify the system-level impact of PV-induced faults at runtime, pinpoint fault-prone router components that should be protected,…

Cited by 34 publications (17 citation statements)
References 20 publications
“…Thereafter, network operation resumes normally. Ariadne leverages up*/down* routing, a deadlock-free algorithm that can operate on any irregular topology [27]. Up*/down* requires each link to be assigned a direction: up or down.…”
Section: Ariadne Algorithm (mentioning)
confidence: 99%
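
The direction assignment mentioned in the statement above can be illustrated with a minimal Python sketch. It follows a common formulation of up*/down* routing, not necessarily the one Ariadne uses: a breadth-first search from a root node gives every node a level, the "up" end of a link is the endpoint closer to the root (ties broken by lower node id), and a route is legal only if it never takes an up link after a down link. The topology `adj`, the root choice, and all function names are assumptions made for illustration.

```python
from collections import deque

def bfs_levels(adj, root):
    """Return the BFS distance of every reachable node from the root."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in levels:
                levels[nbr] = levels[node] + 1
                queue.append(nbr)
    return levels

def is_up(levels, src, dst):
    """True if the hop src -> dst travels 'up' (toward the root).

    The up end of a link is the endpoint with the smaller level;
    equal levels are ordered by node id (an assumed tie-break).
    """
    return (levels[dst], dst) < (levels[src], src)

def is_legal_path(levels, path):
    """A path is legal if no up hop follows a down hop."""
    gone_down = False
    for src, dst in zip(path, path[1:]):
        if is_up(levels, src, dst):
            if gone_down:
                return False
        else:
            gone_down = True
    return True

# Hypothetical 4-node ring: 0-1, 0-2, 1-3, 2-3, with node 0 as root.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
levels = bfs_levels(adj, root=0)

print(is_legal_path(levels, [1, 0, 2]))  # up, then down
print(is_legal_path(levels, [1, 3, 2]))  # down, then up
```

Forbidding the down-then-up turn is what breaks cyclic channel dependencies, which is why the scheme stays deadlock-free on irregular topologies such as a mesh with failed links.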
“…MOTIVATION Recent studies project that there will be many transistor failures during the lifetime of many-core chips fabricated at advanced technology nodes. Researchers have characterized the impact of technology scaling on device reliability in processors [28] and Networks-on-Chips (NoCs) [3], and indicate that the number of permanent failures is expected to increase. Borkar of Intel expects that at future technology nodes 20% of transistors in chip multiprocessors will be unusable due to variations of the manufacturing process, while an additional 10% of transistors will eventually fail over the lifetime of the chip due to wear-out [7,8].…”
Section: Introduction (mentioning)
confidence: 99%
“…System modeling can provide estimates for system parameters, like buffer and queue sizes [11]. As for circuit-level simulation, technology parameters are taken into consideration, like critical path delays and temperature variation [12]. For early design space exploration, system-level simulators are more suitable.…”
Section: Introductionmentioning
confidence: 99%
“…However, recent work on fault modeling [3] has indicated that NoCs fabricated at advanced technology nodes become increasingly unreliable, causing a number of faults, such as loss or corruption of network messages. In order to mitigate this trend, resilient NoC designs have been proposed to allow correct operation in the face of faults in router hardware [9,12] and links [4,18].…”
Section: Introductionmentioning
confidence: 99%
“…However, no resilient NoC can guarantee 100% reliable data transfers (see Section 6). Thus, a subset of network faults is expected to be exposed to upper layers causing lost coherence messages [3], or corrupted coherence messages that are dropped at the destination node when the packet checksum is recomputed [3,7]. Unfortunately, the loss of a single coherence message can cause the entire system to deadlock.…”
Section: Introductionmentioning
confidence: 99%
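
The drop-on-checksum-mismatch behavior described in the statement above can be sketched as follows. The citing paper does not say which checksum is used, so CRC-32 from Python's standard `zlib` module stands in here; the `Packet` type, the function names, and the payload contents are hypothetical.

```python
import zlib
from typing import NamedTuple, Optional

class Packet(NamedTuple):
    payload: bytes
    checksum: int  # CRC-32 of the payload, computed at the source

def make_packet(payload: bytes) -> Packet:
    """Attach a checksum at the source node."""
    return Packet(payload, zlib.crc32(payload))

def receive(packet: Packet) -> Optional[bytes]:
    """Recompute the checksum at the destination; drop on mismatch."""
    if zlib.crc32(packet.payload) != packet.checksum:
        return None  # corrupted message is dropped, so the sender never gets a reply
    return packet.payload

good = make_packet(b"GetS 0x40")
# Simulate an in-flight fault that alters the payload but not the checksum.
corrupted = good._replace(payload=b"GetS 0x41")

print(receive(good))
print(receive(corrupted))
```

From the protocol's point of view a dropped message looks the same as a lost one, which is how a single corrupted coherence message can leave a requester waiting indefinitely.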