SC16: International Conference for High Performance Computing, Networking, Storage and Analysis 2016
DOI: 10.1109/sc.2016.25

SERF: Efficient Scheduling for Fast Deep Neural Network Serving via Judicious Parallelism

Cited by 20 publications (9 citation statements). References 28 publications.
“…Note that each GPU server executes only one DL task at a time. Referring to [23], we assume that both the power consumption and the DL performance are linearly proportional to the number of GPU devices, as defined in (2) and (3). β^i_{j,c}(k), β^i_{j,m}(k), β^i_{j,e}(k) denote iteration time model coefficients at time k, respectively, analogous to α^i_{j,c}(k), α^i_{j,m}(k), α^i_{j,e}(k) [13].…”
Section: B Frequency Scaling Based Power and Performance Modelmentioning
confidence: 99%
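The linear-scaling assumption quoted above can be sketched in a few lines. This is a hypothetical illustration only: the coefficient values and function names are made up for the example and are not taken from [23] or [13].

```python
# Sketch of the quoted assumption: both server power draw and DL
# performance are modeled as linear in the number of GPU devices n.
# All coefficients below are illustrative placeholders.

def power_watts(n_gpus, p_per_gpu=250.0, p_idle=100.0):
    """Server power: an idle baseline plus a linear per-GPU term."""
    return p_idle + p_per_gpu * n_gpus

def throughput(n_gpus, t_per_gpu=900.0):
    """DL performance (e.g., samples/s) assumed proportional to GPU count."""
    return t_per_gpu * n_gpus
```

Under this model, doubling the GPU count doubles throughput and adds a fixed per-GPU power increment, which is what makes the scheduling trade-off in the excerpt tractable.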
“…The primary objective of distributed machine learning is to minimise the time required to execute computing tasks [111]. While parallelisation and distributed architectures increase the available computing resources, applying them naively can harm, rather than improve, system performance [171]. For model training, the system optimization, resource allocation and scheduling challenges essentially concern how to partition a model, where to place model parts, and when to train which part of the model [110] in the shortest possible time, while fully utilising the available computing resources.…”
Section: System Optimizationmentioning
confidence: 99%
“…Automatically mapping tasks to hardware resources, scheduling and balancing workloads, and determining the task execution order are thus important. Recent efforts have investigated adaptive, dynamic load balancing [97], optimal resource allocation and dynamic scheduling [171], and automated, dependence-aware scheduling [168] to improve training speed and system response to varying loads. Meta-optimizations that can be automated to improve model and system performance include parameter search, hyper-parameter search and neural architecture search [17].…”
Section: System Optimizationmentioning
confidence: 99%
“…The result builds on Cosmetatos' approximations,14 which evaluate the M/D^interf/N queue using the M/M/N model with adjustment and exact terms. These approximation methods are adapted to the interference‐aware scenario and solve M/D^interf/N in two ways, i.e., (1) solving the M/D^interf/N queue with an interference‐aware exponential service time.…”
Section: Proposed Heterogeneous Hybridized Fuzzy‐based Dijkstra's Schmentioning
confidence: 99%
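For context on the approximation the excerpt refers to, the sketch below computes the mean M/M/c waiting time via the Erlang C formula and applies the classical Cosmetatos correction to approximate the M/D/c delay. This is the textbook form of the approximation, not the interference-aware variant developed in the cited work, and all function names are illustrative.

```python
import math

def erlang_c(c, a):
    """Erlang C probability of waiting; a = lam/mu is the offered load."""
    rho = a / c
    head = sum(a**k / math.factorial(k) for k in range(c))
    tail = a**c / (math.factorial(c) * (1.0 - rho))
    return tail / (head + tail)

def wq_mmc(lam, mu, c):
    """Mean queueing delay in an M/M/c system."""
    return erlang_c(c, lam / mu) / (c * mu - lam)

def wq_mdc_cosmetatos(lam, mu, c):
    """Cosmetatos' heuristic approximation for the mean M/D/c delay."""
    rho = lam / (c * mu)
    corr = 0.5 * (1.0 + (1.0 - rho) * (c - 1)
                  * (math.sqrt(4.0 + 5.0 * c) - 2.0) / (16.0 * rho * c))
    return wq_mmc(lam, mu, c) * corr
```

A useful sanity check: for c = 1 the correction factor reduces to 1/2, recovering the exact Pollaczek–Khinchine result that the M/D/1 delay is half the M/M/1 delay.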
“…Section 3 describes the problem statement and the major issues considered in these tasks. Section 4 explains the proposed HHFDS methodology, which hybridizes a fuzzy Dijkstra's algorithm with a deep neural network.14,15 Section 5 provides a comparative analysis to demonstrate the improved performance of the proposed algorithm.…”
Section: Introductionmentioning
confidence: 99%