IEEE INFOCOM 2020 - IEEE Conference on Computer Communications
DOI: 10.1109/infocom41043.2020.9155237

Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading

Abstract: Deep neural networks (DNN) are the de facto solution behind many of today's intelligent applications, ranging from machine translation to autonomous driving. DNNs are accurate but resource-intensive, especially for embedded devices such as mobile phones and smart objects in the Internet of Things. To overcome the related resource constraints, DNN inference is generally offloaded to the edge or to the cloud. This is accomplished by partitioning the DNN and distributing computations at the two different ends. Howe…

Cited by 134 publications (70 citation statements)
References 34 publications
“…This problem is similar to the one we face for the exchange communication in Appendix A-B, with some exceptions. The Lagrangian is … By following the same path as in the exchange communication solution in Appendix A-B, we can find the optimal value $(p_k^{\mathrm{FRD}})^* = (E_k^{\mathrm{FRDRF}})^* / (t_k^{\mathrm{FRD}})^*$ as in (39), where, using (88) and the complementary slackness conditions, the optimal value of $(t_k^{\mathrm{FRD}})^*$ can be found as in (40).…”
Section: Solution of Problem (75) (mentioning)
confidence: 99%
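For readers skimming this excerpt out of context: the relation $(p_k^{\mathrm{FRD}})^* = (E_k^{\mathrm{FRDRF}})^*/(t_k^{\mathrm{FRD}})^*$ is simply transmit power expressed as energy over time, evaluated at the optimum, and complementary slackness is what pins down $(t_k^{\mathrm{FRD}})^*$. As a rough, generic sketch only (the actual objective, constraint, and multiplier are defined in the cited paper; $f$, $g$, and $\lambda$ below are placeholders):

```latex
% Generic KKT conditions of the kind the excerpt invokes; f is the
% objective and g <= 0 a constraint -- placeholders, not the cited
% paper's actual formulation.
\begin{align*}
\mathcal{L}(p, t, \lambda) &= f(p, t) + \lambda\, g(p, t)
  && \text{(Lagrangian)} \\
\nabla_{p,t}\, \mathcal{L} &= 0
  && \text{(stationarity at } (p^{*}, t^{*})\text{)} \\
\lambda^{*}\, g(p^{*}, t^{*}) &= 0, \quad \lambda^{*} \ge 0
  && \text{(complementary slackness)}
\end{align*}
```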
“…Notwithstanding, along the continuum the Fog plays a role in addition to the Edge and the Cloud. This led to DINA [28], a fine-grained solution based on matching theory for dynamic DDNN partitioning in fog networks. Regardless, early exits, as proposed by BranchyNet [26], also need to be considered, since stopping inference early at intermediate layers reduces not only response time but also network traffic [29] and the computing capacity required [30].…”
Section: Related Work (mentioning)
confidence: 99%
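To make the early-exit idea in the excerpt above concrete, here is a minimal, hypothetical sketch (plain NumPy; the layer and branch callables are placeholders, not the actual BranchyNet or DINA code): inference runs the backbone layer by layer and returns at the first side branch whose softmax confidence clears a threshold, so easy inputs never pay for the deeper layers.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_infer(x, layers, branches, final_head, threshold=0.9):
    """Run `layers` in order; `branches` maps a layer index to a small
    classifier attached at that depth. Stop at the first branch whose
    top softmax probability reaches `threshold`."""
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        if i in branches:
            probs = softmax(branches[i](h))
            if probs.max() >= threshold:
                return int(probs.argmax()), i       # early exit here
    probs = softmax(final_head(h))
    return int(probs.argmax()), len(layers)         # ran the full network

# Toy usage with random linear layers (purely illustrative):
rng = np.random.default_rng(0)
layers = [lambda h, W=rng.normal(size=(8, 8)): np.tanh(W @ h) for _ in range(4)]
branches = {1: (lambda h, W=rng.normal(size=(3, 8)): W @ h)}
final_head = lambda h, W=rng.normal(size=(3, 8)): W @ h
pred, depth = early_exit_infer(rng.normal(size=8), layers, branches, final_head)
```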
“…Recently, much effort has focused on accelerating DNN inference through task offloading in MEC environments. Mohammed et al. [15] devised a novel DNN partitioning scheme for an MEC network and applied matching theory to distribute the DNN parts across edge servers, with the aim of minimizing total computation time. Xu et al. [21] investigated DNN inference offloading in an MEC network, assuming that each requested DNN has already been partitioned.…”
Section: Related Work (mentioning)
confidence: 99%
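As an illustration of the partition-point search that this line of work builds on, here is a brute-force sketch in the spirit of layer-wise splitting: pick the split index that minimizes device-side compute plus transfer plus server-side compute. All names and the latency model below are assumptions for illustration; DINA's matching-theory assignment across multiple fog nodes is more involved and not reproduced here.

```python
def split_latency_ms(k, device_ms, server_ms, activation_bytes, bandwidth_bps):
    """End-to-end latency if layers [0, k) run on the device and [k, n) on
    the server. activation_bytes[k] is the data crossing the network at
    split k (activation_bytes[0] = raw input size); a fully local run
    (k == n) sends nothing, and the returned result is assumed negligible."""
    n = len(device_ms)
    tx_ms = 0.0 if k == n else activation_bytes[k] * 8e3 / bandwidth_bps
    return sum(device_ms[:k]) + tx_ms + sum(server_ms[k:])

def best_partition(device_ms, server_ms, activation_bytes, bandwidth_bps):
    """Exhaustively try every split point, including full offload (k == 0)
    and fully local execution (k == n)."""
    n = len(device_ms)
    return min(range(n + 1),
               key=lambda k: split_latency_ms(k, device_ms, server_ms,
                                              activation_bytes, bandwidth_bps))

# Toy example: 4 layers, a slow device, a 10 Mbit/s uplink.
dev = [40.0, 35.0, 30.0, 25.0]                 # ms per layer on the device
srv = [4.0, 3.5, 3.0, 2.5]                     # ms per layer on the server
act = [600_000, 80_000, 20_000, 5_000, 1_000]  # bytes at each split point
k = best_partition(dev, srv, act, 10e6)        # -> 2: run two layers locally
```

Because activations shrink as data moves through a typical network while the raw input is large, the optimum here lands at an intermediate layer rather than at either extreme, which is the effect the partitioning literature exploits.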