2018 IEEE High Performance Extreme Computing Conference (HPEC), 2018
DOI: 10.1109/hpec.2018.8547571

Designing Algorithms for the EMU Migrating-threads-based Architecture

Cited by 8 publications (4 citation statements)
References 15 publications
“…Other recent work has also looked to extend from low-level characterizations like those presented here by providing initial Emu-focused implementations of Breadth-First Search [11], Jaccard index computation [42], bitonic sort, [66] and compiler optimizations like loop fusion, edge flipping, and remote updates to reduce migrations [16].…”
Section: Related Work (mentioning)
confidence: 99%
“…Previous work has investigated the initial Emu architecture design [23], algorithmic designs for merge and radix sorts on the Emu hardware [51], and baseline performance characteristics of the Emu Chick hardware [11,36]. This investigation is focused on determining how irregular algorithms perform on the prototype Chick hardware and how we implement specific algorithms so that they can scale to a rack-scale Emu and beyond.…”
Section: Introduction (mentioning)
confidence: 99%
“…Simulations of architectures focusing on near-data processing [14] including in-memory [15] and near-memory [16] show great promise for increasing performance while also drastically reducing energy usage. Other than our previous study [1], and related work on characterizing the Emu by other research groups [17,18] few of these architectures have been implemented in hardware, even FPGAs, limiting the data scales on which applications can be evaluated.…”
Section: Related Work (mentioning)
confidence: 99%
“…Other recent work has also looked to extend from low-level characterizations like those presented here by providing initial Emu-focused implementations of Breadth-First Search [17], Jaccard index computation [27], bitonic sort, [28] and compiler optimizations like loop fusion, edge flipping, and remote updates to reduce migrations [29].…”
Section: Related Work (mentioning)
confidence: 99%