2008
DOI: 10.1109/ipdps.2008.4536349

Using hardware multithreading to overcome broadcast/reduction latency in an associative SIMD processor

Cited by 1 publication (1 citation statement)
References 8 publications
“…The key to AP hardware design is [53,54,55,52]: 1) high memory-to-PE bandwidth, and 2) low level synchronous operation supported by the i) elimination of branches in low level loops, and ii) the elimination of low level barrier synchronism. In the STARAN and the ASPRO, these abilities were supported by corner turning the data and assigning one record per processor, the multi-dimensional array memory (and flip network) and mask register hardware.…”
Section: AP Properties (mentioning)
confidence: 99%
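
The citation statement above names mask register hardware and one record per processor as the means of eliminating branches in low-level loops. A minimal sketch of that idea follows, assuming a software model rather than the STARAN or ASPRO hardware itself: the function names `associative_search` and `masked_add` and the sample records are hypothetical, and the Python list comprehension only stands in for processing elements (PEs) that would operate in lock step.

```python
# Hypothetical model of one associative SIMD step: each processing element
# (PE) holds one record, and a mask register decides which PEs keep a result.
# This is an illustration only, not the design of any particular machine.

def associative_search(records, key):
    """Broadcast `key` to every PE; each PE sets its mask bit on a match."""
    return [int(record == key) for record in records]


def masked_add(records, mask, delta):
    """Every PE performs the same add; the mask selects which results commit.

    The arithmetic select (m * new + (1 - m) * old) plays the role of the
    mask register, so the per-PE step has no data-dependent branch.
    """
    return [m * (r + delta) + (1 - m) * r for r, m in zip(records, mask)]


records = [7, 3, 7, 9]                  # one record per PE
mask = associative_search(records, 7)   # responders to the broadcast key
updated = masked_add(records, mask, 10)
```

In this sketch every PE runs the identical instruction sequence, and only the mask bit differs between PEs, which is the property the statement credits for branch-free low-level loops.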