2005 IEEE Hot Chips XVII Symposium (HCS) 2005
DOI: 10.1109/hotchips.2005.7476577

Cell broadband engine interconnect and memory interface

Cited by 8 publications (9 citation statements)
References 0 publications
“…The correlation coefficient was calculated to be 0.899. One source of difference, as highlighted by the data for CP which actually achieves a normalized IPC over 1, is likely due to compiler optimizations in ptxas which may reduce the instruction count on real hardware 7. Overall, the data shows that applications that perform well in real GPU hardware perform well in our simulator and applications that perform poorly in real GPU hardware also perform poorly in our simulator.…”
Section: (28 Shader Cores X 8-wide Pipelines); citation type: mentioning
confidence: 99%
“…Branch divergence was highlighted by Fung et al as a major source of performance loss for multithreaded SIMD 7. We only simulate the input PTX code which, in CUDA, ptxas then assembles into a proprietary binary format that we are unable to simulate.…”
Section: Branch Divergence; citation type: mentioning
confidence: 99%
“…The Tilera TILE-Gx processors [44] and Intel Polaris [16] are examples of real packet-switched NoC implementations with up to 100 and 80 cores, respectively. The Cell BE [9] uses a circuit-switched network to connect heterogeneous cores and a single memory controller.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…The First-Generation CELL processor consists of the PPE and its L2 cache, eight SPEs [2] each with its own local memory (LS) [3], a high bandwidth internal Element Interconnect Bus (EIB) [4], two configurable non-coherent I/O interfaces, a Memory Interface Controller (MIC), and a Pervasive unit that supports extensive test, monitoring, and debug functions. The high level chip diagram is shown in figure 1 below.…”
Section: Introduction; citation type: mentioning
confidence: 99%