2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Compu 2010
DOI: 10.1109/greencom-cpscom.2010.133

Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models

Abstract: The insatiable demand for high performance computing is being driven by the most computationally intensive applications, such as computational chemistry, climate modeling, and nuclear physics. The last couple of decades have seen a tremendous rise in supercomputers, with architectures ranging from traditional clusters to system-on-a-chip, built to break the petaflop computing barrier. However, with the advent of petaflop-plus computing, we have ushered in an era where power efficient system software stack is …

Cited by 19 publications (5 citation statements)
References 37 publications
“…The approaches that target the Message Passing Interface (MPI) applications mainly involve mitigation of workload imbalance between the process (slack) [4,16,29,41]. Other MPI-centric solutions address cases where the processor cores wait on the memory or network [5,22,26,43,48,49]. Concurrency throttling has been widely used by adapting the thread count in OpenMP programs that are memory-constrained to reduce power consumption [12,13,31,37].…”
Section: Related Work (mentioning)
confidence: 99%
“…The more sophisticated ones scale processor frequency on different intervals of application runtime while attempting to predict accurately the performance effects from the DVFS. Such approaches may be broadly classified into two types: One that first divides the application into execution intervals of predefined duration and then uses the performance counters to determine a suitable frequency for them [7,10,11]; and the other that first determines communication intervals in parallel applications that use either explicit message passing [6,15,22,23] or global address-space primitives [24] and then scales the frequency for those intervals, usually based on the variation of the MIPS (million instructions per second) metric at different P-states. Typically these approaches first choose a (often user-defined) performance loss (PL) tolerance for the application and then try to maximize energy savings under this PL as constraint.…”
Section: Employing DVFS (mentioning)
confidence: 99%
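The statement above describes picking a frequency for an interval from the variation of MIPS across P-states, subject to a performance loss (PL) tolerance. A minimal sketch of that selection step follows. It assumes that lower frequency means lower energy and that the MIPS drop relative to the highest frequency approximates the performance loss. The function name `pick_frequency`, the frequency values, and the MIPS figures are hypothetical and do not come from any of the cited papers.

```python
from typing import Dict


def pick_frequency(mips_by_freq: Dict[float, float], pl_tolerance: float) -> float:
    """Return the lowest frequency whose estimated performance loss is within tolerance.

    mips_by_freq maps a P-state frequency (GHz) to the MIPS measured for one
    interval at that frequency. Performance loss is estimated as the relative
    MIPS drop against the highest frequency (an assumption of this sketch).
    """
    if not mips_by_freq:
        raise ValueError("need at least one P-state measurement")
    if pl_tolerance < 0:
        raise ValueError("performance loss tolerance must be non-negative")

    f_max = max(mips_by_freq)
    baseline = mips_by_freq[f_max]
    if baseline <= 0:
        raise ValueError("baseline MIPS must be positive")

    # Try frequencies from lowest to highest; the first one that meets the
    # tolerance is taken as the most energy-saving admissible choice.
    for freq in sorted(mips_by_freq):
        loss = (baseline - mips_by_freq[freq]) / baseline
        if loss <= pl_tolerance:
            return freq
    return f_max


# Hypothetical communication interval: MIPS barely changes with frequency.
comm_interval = {1.6: 960.0, 2.0: 980.0, 2.4: 1000.0}
# Hypothetical compute interval: MIPS tracks frequency closely.
compute_interval = {1.6: 700.0, 2.0: 850.0, 2.4: 1000.0}

print(pick_frequency(comm_interval, 0.05))     # 1.6
print(pick_frequency(compute_interval, 0.05))  # 2.4
```

With a 5% tolerance, the communication-like interval can drop to the lowest P-state, while the compute-like interval stays at the highest one. This is the behavior the quoted approaches rely on when they scale frequency only during communication intervals.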
“…The other approaches primarily focus on scaling processor frequency during slack or communication operations during application runtime. The techniques in the past have targeted communication intervals in parallel applications that use either explicit message passing (… Lowenthal 2005, Lim, Freeh, and Lowenthal 2006) or global address-space primitives (Vishnu, Song, Marquez, Barker, Kerbyson, Cameron, and Balaji 2010) and then scales the frequency for those intervals. Oversubscribing the processor cores (Iancu, Hofmeyr, Blagojevic, and Zheng 2010) is another technique which can be used to reduce execution time and lower power consumption of a parallel application.…”
Section: Related Work (mentioning)
confidence: 99%