Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2022
DOI: 10.1145/3503221.3508418

FasterMoE

Cited by 15 publications (5 citation statements)
References 30 publications
“…On top of that, we test the overall throughput and the speedup of TA-MoE over these two classical baselines. To be more comprehensive, we also compare with the recently proposed FasterMoE Hir gate [9] on the metric of time to convergence performance. Besides, a detailed analysis of communication costs, as well as the distribution of the dispatch are also given.…”
Section: Discussion (mentioning)
confidence: 99%
“…To be more comprehensive, we make further comparisons with the recently proposed FasterMoE [9]. Because the compulsory dispatch strategy of FasterMoE affects the convergence, we take the validation loss w.r.t time as the comparison metric.…”
Section: Discussion (mentioning)
confidence: 99%