2020
DOI: 10.48550/arxiv.2002.07386
Preprint

ResiliNet: Failure-Resilient Inference in Distributed Neural Networks

Abstract: When a neural network is partitioned and distributed across physical nodes, failure of physical nodes causes the failure of the neural units that are placed on those nodes, which results in a significant performance drop. Current approaches focus on resiliency of training in distributed neural networks. However, resiliency of inference in distributed neural networks is less explored. We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures. Resili…

Cited by 1 publication (1 citation statement)
References 12 publications (16 reference statements)
“…Yousefpour et al [68,69] introduce the concept of skipping hyperconnections in distributed DNN, which provides a certain fault recovery capability for inference in distributed DNN. The concept of skipping hyperconnections is similar to skipping connections in the residual network.…”
Section: Failure-resilient Distributed DNN Model
Citation type: mentioning
confidence: 99%
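
The citation statement above compares skip hyperconnections to the skip connections of a residual network. A toy sketch of that idea follows; it is not the ResiliNet implementation. The names `run_chain` and `alive` are hypothetical, each "node" is reduced to a single function on a vector, and summing the two incoming signals is an assumption borrowed from residual connections.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def run_chain(x: Vector, layers: Sequence[Layer],
              alive: Sequence[bool]) -> Optional[Vector]:
    """Forward pass through a chain of nodes with skip hyperconnections.

    Each node reads the output of the node directly before it and, through
    a skip hyperconnection, the output of the node two steps back. A failed
    node (alive[i] is False) produces no output, so the next node falls
    back on the skip input alone. Returns None when no signal reaches the
    end of the chain.
    """
    prev: Optional[Vector] = x          # signal from one step back
    prev_prev: Optional[Vector] = None  # signal from two steps back
    for layer, up in zip(layers, alive):
        out: Optional[Vector] = None
        if up:
            # Keep only the inputs that actually arrived.
            inputs = [v for v in (prev, prev_prev) if v is not None]
            if inputs:
                combined = inputs[0]
                for v in inputs[1:]:
                    combined = add(combined, v)
                out = layer(combined)
        prev_prev, prev = prev, out
    return prev


def double(v: Vector) -> Vector:
    return [2 * t for t in v]


def add_one(v: Vector) -> Vector:
    return [t + 1 for t in v]


if __name__ == "__main__":
    layers = [double, add_one, double]
    print(run_chain([1.0], layers, [True, True, True]))    # all nodes up
    print(run_chain([1.0], layers, [True, False, True]))   # middle node down
    print(run_chain([1.0], layers, [False, False, True]))  # two nodes in a row down
```

In this sketch a single failed node is bypassed, while two consecutive failures cut the signal entirely, because each skip hyperconnection spans only one node.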