2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2017.36
Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing

Cited by 20 publications (14 citation statements)
References 11 publications
“…There are several policies governing where data is homed. A common high-performance configuration [12], which is also the one we used in our study, is the quadrant mode. Quadrant mode means that the physical cores are divided into four logical parts, where each logical part is assigned two memory controllers; each logical group is treated as a unique Non-Uniform Memory-Access (NUMA) node, allowing the operating system to perform data-locality optimizations.…”
Section: A Hardware and Software Environmentmentioning
confidence: 99%
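The excerpt above describes how the operating system sees each logical group of KNL cores as a NUMA node and can exploit data locality. On Linux, that topology can be inspected and used with `numactl`; the following is a hedged sketch (the node number and the `./train` binary are placeholders, not from the cited study, and the number of nodes the OS exposes depends on the configured cluster and MCDRAM modes):

```shell
# List the NUMA nodes the OS exposes, with their CPUs and memory sizes.
# The node count reflects the configured cluster mode (e.g. quadrant, SNC-4).
numactl --hardware

# Bind a workload's threads and memory allocations to a single node so
# accesses stay local to that group's memory controllers.
numactl --cpunodebind=0 --membind=0 ./train
```

Binding both CPUs and memory to the same node is what lets the data-locality optimizations mentioned in the excerpt take effect.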
“…Manycore machines, including KNL, are widely used for deep learning, as standalone devices or within clusters, e.g. [29], [30]. SVM training on multicore and manycore architectures was proposed by You et al [31].…”
Section: E Evaluation Of Quantized Representationmentioning
confidence: 99%
“…A heterogeneous system is composed of general-purpose CPUs and special-purpose hardware accelerators, such as GPUs, Xeon Phi, FPGAs, or TPUs. This concept covers a wide range of systems, from powerful computing nodes capable of executing teraflops [2] to integrated CPU-GPU chips [3]. This architecture not only significantly increases computing power but also improves energy efficiency.…”
Section: Introductionmentioning
confidence: 99%