2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA) 2013
DOI: 10.1109/hpca.2013.6522319

Runnemede: An architecture for Ubiquitous High-Performance Computing

Abstract: DARPA's Ubiquitous High-Performance Computing (UHPC) program asked researchers to develop computing systems capable of achieving energy efficiencies of 50 GOPS/Watt, assuming 2018-era fabrication technologies. This paper describes Runnemede, the research architecture developed by the Intel-led UHPC team. Runnemede is being developed through a co-design process that considers the hardware, the runtime/OS, and applications simultaneously. Near-threshold voltage operation, fine-grained power and clock management,…

Cited by 77 publications
(53 citation statements)
References 26 publications
“…Future and emerging many-core processors, such as Intel's Runnemede [5], will provide communication pathways through distributed address spaces or shared address spaces, both on-chip and off-chip. The idea elaborated in this work is to use distributed address spaces in runtime system stages where cores share no application data and need to exchange only control messages for the purposes of scheduling and load balancing.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, processors designed for more specialized markets, such as high performance computing and large-scale data processing, use memory hierarchies without a coherence protocol. Graphics Processing Units (GPUs) [2], the Intel SCC [3], the Cell processor [4], and the experimental Runnemede prototype [5] are representative examples of non-cache-coherent architectures. Programming a non-coherent architecture requires explicit communication between local address spaces, through message passing or Direct Memory Access (DMA).…”
Section: Introduction (mentioning)
confidence: 99%
“…This results in energy-inefficient designs. On-chip networks can already consume a substantial fraction of the on-chip power, potentially up to 30-40% according to the literature [4,6,8,13,17,29]. Conservative future network designs, needed to tolerate parameter variations, may be unable to reduce the value of this fraction much.…”
Section: Introduction (mentioning)
confidence: 99%
“…The Rigel architecture [21] proposes having clusters of cores with L1 instruction caches and incoherent L2 caches (per cluster), together with a global shared L3 cache. Finally, the Runnemede architecture [8] also relies on a dataflow execution model to execute in a near-threshold computing environment, with multiple clusters of homogeneous cores and a hierarchy of local memories. In this architecture, coherence between clusters is fully managed in software.…”
Section: Runtime-aware Architectures (mentioning)
confidence: 99%