Proceedings of Workshops of HPC Asia 2018
DOI: 10.1145/3176364.3176373

Scaling collectives on large clusters using Intel(R) architecture processors and fabric

Abstract: This paper provides results on scaling Barrier and Allreduce to 8192 nodes on a cluster of Intel® Xeon Phi™ processors installed at the University of Tokyo and the University of Tsukuba. We will describe the effects of OS and platform noise on the performance of these collectives, and provide ways to minimize the noise as well as isolate it to specific cores. We will provide results showing that Barrier and Allreduce scale well when noise is reduced. We were able to achieve a latency of 94 usec (7.1x speedup …

Cited by 4 publications (2 citation statements)
References 4 publications
“…The single node deployment is offered via faasd where resource limits are not available. The recommended deployment option is again K8s where resource limits for memory and CPU can be set independently 100 . This feature is based on K8s facilities to restrict resources within the deployment YAML adapted by OpenFaaS.…”
Section: Resource Scaling Strategies (mentioning)
confidence: 99%
“…[37], wanted to be more flexible in changing the frequency by hand during their benchmarks. Since some of the generic scaling governors are overwritten (by using the same name) by intel_pstate and others are not usable, sometimes researchers [63,82,100,140,236] disabled it to use the generic ones. None of the aforementioned papers specified one of the first three options (1-3) explicitly in their papers.…”
Section: Related Work (mentioning)
confidence: 99%