2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2017
DOI: 10.1109/iccad.2017.8203781

A load balancing inspired optimization framework for exascale multicore systems: A complex networks approach

Citations: cited by 37 publications (35 citation statements)
References: 26 publications
“…Mapping side-by-side threads on the same cores minimizes the overall communication between cores. Another approach [27], which pursues the same goal, minimizes inter-core communication overhead by splitting the threads into clusters such that the amount of communication between clusters is minimized, while the number of clusters must not exceed the number of cores. The authors of [27] state that mapping clusters of threads to cores is more efficient than mapping individual threads to cores, because it is easier to minimize the amount of communication between clusters than between threads.…”
Section: Comparison of the NUMA-BTLP Algorithm and Other Work
Citation type: mentioning
confidence: 99%
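
The clustering idea described in the statement above can be illustrated with a minimal sketch. It is not the algorithm of the cited paper: it assumes a simple greedy heuristic that repeatedly merges the two clusters exchanging the most data until the number of clusters no longer exceeds the number of cores. The names `cluster_threads`, `pair_traffic`, `inter_cluster_traffic`, and the example communication matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pair_traffic(comm, a, b):
    """Total traffic (both directions) between thread sets a and b."""
    return sum(comm[i][j] + comm[j][i] for i in a for j in b)


def cluster_threads(comm, num_cores):
    """Greedy sketch: merge the two most talkative clusters until
    at most num_cores clusters remain.

    comm[i][j] is the (assumed) volume of data thread i sends to thread j.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # Start with one cluster per thread.
    clusters = [{t} for t in range(len(comm))]
    while len(clusters) > num_cores:
        # Pick the pair of clusters with the heaviest mutual traffic.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: pair_traffic(comm, clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] |= clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters


def inter_cluster_traffic(comm, clusters):
    """Traffic that still crosses cluster (and hence core) boundaries."""
    return sum(pair_traffic(comm, a, b) for a, b in combinations(clusters, 2))


if __name__ == "__main__":
    # Hypothetical workload: threads 0-1 and threads 2-3 talk heavily.
    comm = [
        [0, 9, 1, 0],
        [9, 0, 0, 1],
        [1, 0, 0, 8],
        [0, 1, 8, 0],
    ]
    clusters = cluster_threads(comm, num_cores=2)
    print(clusters, inter_cluster_traffic(comm, clusters))
```

Each resulting cluster would then be assigned to one core, so only the traffic reported by `inter_cluster_traffic` crosses core boundaries. A greedy merge is only one possible heuristic and gives no optimality guarantee.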
“…The speedup obtained by mapping clusters of threads instead of threads, using the algorithm in [27], varies from 10.2% to 131.82% when compared to speedup obtained when mapping individual threads. However, when the threads are independent, mapping individual threads, as happens with NUMA-BTLP [5], is 10% more efficient in terms of execution time than mapping clusters, as happens with the algorithm in [27].…”
Section: Comparison of the NUMA-BTLP Algorithm and Other Work
Citation type: mentioning
confidence: 99%