2020
DOI: 10.14778/3407790.3407796

Dynamic parameter allocation in parameter servers

Abstract: To keep up with increasing dataset sizes and model complexity, distributed training has become a necessity for large machine learning tasks. Parameter servers ease the implementation of distributed parameter management, a key concern in distributed training, but can induce severe communication overhead. To reduce this overhead, distributed machine learning algorithms use techniques to increase parameter access locality (PAL), achieving up to linear speed-ups. We found that existing parameter serve…
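
The abstract's central idea, keeping frequently accessed parameters local to the worker that needs them instead of always reaching over the network, can be made concrete with a small client-side sketch. The Python sketch below is illustrative only: the class, the method names, and the messaging-layer calls are assumptions, not the API described in the paper.

    # Minimal sketch of a parameter-server client with a relocation primitive.
    # All names (ParameterClient, pull, push, localize) and the messaging-layer
    # calls are illustrative assumptions, not the paper's actual API.

    class ParameterClient:
        def __init__(self, rank, key_to_node, local_store, network):
            self.rank = rank                # id of this node
            self.key_to_node = key_to_node  # dynamic mapping: parameter key -> owning node
            self.local_store = local_store  # dict: key -> value, for locally owned keys
            self.network = network          # stand-in for the messaging layer (assumed)

        def pull(self, key):
            """Read a parameter value, locally if owned, otherwise over the network."""
            owner = self.key_to_node[key]
            if owner == self.rank:
                return self.local_store[key]               # local access, no round trip
            return self.network.request_value(owner, key)  # remote access, one round trip

        def push(self, key, update):
            """Apply an additive update to a parameter."""
            owner = self.key_to_node[key]
            if owner == self.rank:
                self.local_store[key] += update
            else:
                self.network.send_update(owner, key, update)

        def localize(self, key):
            """Relocate a parameter to this node so that subsequent pull/push
            calls hit the local store (parameter access locality)."""
            owner = self.key_to_node[key]
            if owner != self.rank:
                value = self.network.take_ownership(owner, key)  # assumed primitive
                self.local_store[key] = value
                self.key_to_node[key] = self.rank  # in a real system this mapping change
                                                   # must also be propagated to other nodes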

Cited by 13 publications (21 citation statements). References 40 publications.

“…In contrast to static full replication, a classic PS uses network bandwidth only when parameters are actually accessed. However, a classic PS is often inefficient due to access latency [41,42]. Figure 1 depicts the performance of both approaches for a task of training large-scale knowledge graph embeddings.…”
Section: Model Quality (mentioning)
Confidence: 99%
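
One way to read this comparison is as two different per-step costs: static full replication pays bandwidth for every parameter every step, while a classic parameter server pays a network round trip for each parameter a worker actually accesses. The back-of-the-envelope Python sketch below uses purely illustrative numbers (not measurements from the cited papers) to show how either side can dominate.

    # Illustrative per-step cost model for one worker. All constants are assumed
    # for illustration; they are not measurements from the cited papers.

    total_params    = 100_000_000  # parameters in the model
    touched_params  = 50_000       # parameters this worker accesses per step (sparse access)
    bytes_per_param = 4            # float32

    bandwidth_bytes_per_s = 10e9 / 8  # roughly a 10 Gbit/s link
    round_trip_latency_s  = 100e-6    # roughly 100 microseconds per uncached remote access

    # Static full replication: synchronize all parameters every step (bandwidth-bound).
    replication_cost = total_params * bytes_per_param / bandwidth_bytes_per_s

    # Classic PS: one round trip per accessed parameter, worst case (latency-bound).
    classic_ps_cost = touched_params * round_trip_latency_s

    print(f"full replication: ~{replication_cost:.2f} s/step")
    print(f"classic PS:       ~{classic_ps_cost:.2f} s/step")

With these assumed numbers the classic PS is latency-bound and slower, which is the inefficiency the statement refers to; with denser access or a slower link, replication's bandwidth term dominates instead.
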
“…For example, the Petuum PS [12,16] selectively replicates parameters on specific nodes when the nodes access these parameters. The Lapse PS [42] dynamically relocates parameters among nodes to hide access latency. Multi-technique PSs [41,57] combine different parameter management techniques (e.g., replication and relocation) and pick a suitable one for each parameter.…”
Section: Model Quality (mentioning)
Confidence: 99%
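
The techniques named here (selective replication, relocation, and multi-technique combinations) amount to a per-parameter choice of management strategy. The sketch below shows one hypothetical way such a choice could be driven by access statistics; the policy and thresholds are assumptions for illustration, not the logic of Petuum, Lapse, or the multi-technique systems cited.

    # Hypothetical per-parameter policy: replicate hot, read-mostly parameters;
    # relocate parameters accessed intensively by one node at a time; otherwise
    # leave them on their home server. Thresholds are illustrative assumptions.

    from enum import Enum

    class Technique(Enum):
        REPLICATE = "replicate"  # keep copies on accessing nodes, synchronize updates
        RELOCATE  = "relocate"   # move the single copy to the node that needs it
        CLASSIC   = "classic"    # leave it on its home server, access remotely

    def choose_technique(accessing_nodes, accesses_per_step, updates_per_step):
        """Pick a management technique from simple access statistics."""
        if accessing_nodes > 1 and updates_per_step < 0.1 * accesses_per_step:
            return Technique.REPLICATE  # many readers, few writes: replication amortizes
        if accessing_nodes == 1 and accesses_per_step > 100:
            return Technique.RELOCATE   # one intensive user: relocation hides latency
        return Technique.CLASSIC

    # Example: an embedding row used heavily by a single worker during one phase.
    print(choose_technique(accessing_nodes=1, accesses_per_step=5000, updates_per_step=5000))

The design point the cited systems share is that no single technique fits all parameters, so the decision is made per parameter rather than globally.
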
“…Each worker executes the training algorithm over its local partition, and synchronizes with other workers from time to time. A typical implementation of data parallelism is parameter server [2,29,30,45,50,63,84]. Another popular implementation is message passing interface (MPI) [38], e.g., the AllReduce MPI primitive leveraged by MLlib [72], XGBoost [27], PyTorch [64], etc [60].…”
Section: Related Work (mentioning)
Confidence: 99%
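
The data-parallel pattern described here is the same regardless of whether synchronization goes through a parameter server or an AllReduce primitive: each worker computes updates on its own partition and periodically merges them with the others. A minimal Python sketch of the AllReduce-style variant follows; the function name and the model object are placeholders, not the API of MPI, MLlib, XGBoost, or PyTorch.

    # Sketch of data-parallel training on one worker. `allreduce(grad)` is assumed
    # to return the element-wise sum of `grad` across all workers, as the MPI
    # AllReduce primitive would; `model` is a placeholder with `gradient` and
    # `apply_update` methods.

    def train_data_parallel(local_batches, model, num_workers, allreduce, lr=0.01):
        for batch in local_batches:
            grad = model.gradient(batch)    # gradient on this worker's local batch
            summed = allreduce(grad)        # synchronize with the other workers
            model.apply_update(-lr * summed / num_workers)  # averaged SGD step
        return model
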
“…Also, we utilize a modified distributed scheme to speed it up. Parsa [15] proposes a distributed partition algorithm to reduce the communication overhead. As it aims at memory-resident PS, Parsa does not take disk I/O cost into account, which cannot be neglected in our DRPS.…”
Section: Parameter Index and Partition (mentioning)
Confidence: 99%
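
The distinction drawn here is about the cost model behind the partitioning: a memory-resident formulation can optimize network traffic alone, while a disk-resident PS also pays for disk reads. The sketch below is a hypothetical objective function that includes both terms; it is not the formulation used by Parsa or DRPS.

    # Hypothetical partitioning objective with both a network and a disk term.
    # The cost model and constants are assumptions for illustration only.

    def assignment_cost(assignment, accesses, net_cost=1.0, disk_cost=10.0,
                        in_memory=frozenset()):
        """assignment: {param_key: owning_node}
        accesses:   {(param_key, node): access_count}
        in_memory:  keys that are cached in memory on their owning node."""
        total = 0.0
        for (key, node), count in accesses.items():
            if assignment[key] != node:   # remote access: pay for network traffic
                total += net_cost * count
            if key not in in_memory:      # parameter resides on disk: pay for reads
                total += disk_cost * count
        # A memory-resident formulation would drop the disk term entirely.
        return total
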