2014
DOI: 10.1109/l-ca.2012.34

Generalized MultiAmdahl: Optimization of Heterogeneous Multi-Accelerator SoC

Cited by 18 publications (13 citation statements)
References 4 publications
“…The objective function representing the average memory delay, yielded by the best of three possible configurations, under variety of constraint resources is obtained using KKT multipliers similarly to [4]:…”
Section: Optimizing Cache Hierarchy (mentioning)
confidence: 99%
“…PiM and SIMD taxonomy. to account for the "uncore" components, concluding that to sustain the scalability of future many-core systems, the uncore components must be designed to scale sublinearly with respect to the overall core count. Morad et al. [2013] presented several frameworks that, given (a) a multicore architecture consisting of last-level cache (LLC), processing cores, and an NoC interconnecting the cores and the LLC; (b) workloads consisting of sequential and concurrent tasks; and (c) physical resource constraints (area, power, execution time, off-chip bandwidth), find the optimal selection of a subset of the available processing cores and the optimal resource allocation among all blocks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Morad et al. [7] [8] proposed models that minimized sequential and concurrent execution time of heterogeneous and asymmetric SoC processing cores. The limitations of the frameworks presented in [38], [8] and [7] are: (a) modeling the processing cores, but not addressing common building blocks such as NoC and LLC; (b) modeling workloads containing either a sequence of sequential heterogeneous tasks [38] [8], or modeling workloads containing a sequence of concurrent sections [5], but not both together; (c) utilizing Lagrange multipliers, thus identifying the necessary condition for optimality, but not the optimal point; (d) modeling constrained area [38] [8], or constrained area/power designs [5], but not addressing off-chip bandwidth; and (e) solving for optimal execution time under area/power constraints, but not addressing optimization of power or area under constraints.…”
Section: Related Work (mentioning)
confidence: 99%
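
The statement above refers to using Lagrange multipliers to allocate a constrained resource among processing cores. As a rough illustration only, the sketch below allocates a fixed area budget among accelerators that run sequential tasks, minimizing total execution time. It is not taken from the cited paper or the citing works: the performance model `perf(area) = sqrt(area)` (Pollack's rule) and the names `perf`, `total_time`, and `allocate_area` are assumptions made for this example. Under that assumed model, setting the derivative of the Lagrangian to zero gives an area share proportional to `t_i ** (2/3)`, where `t_i` is the task's baseline time.

```python
import math


def perf(area):
    # Assumed speedup model (Pollack's rule): performance grows as sqrt(area).
    # This is an illustrative assumption, not the model of the cited paper.
    return math.sqrt(area)


def total_time(times, areas):
    # Sequential workload: task i runs on accelerator i, one after another.
    return sum(t / perf(a) for t, a in zip(times, areas))


def allocate_area(times, total_area):
    # Minimize sum(t_i * a_i**-0.5) subject to sum(a_i) == total_area.
    # Stationarity of the Lagrangian: 0.5 * t_i * a_i**-1.5 == lambda for all i,
    # hence a_i is proportional to t_i ** (2/3).
    weights = [t ** (2.0 / 3.0) for t in times]
    scale = total_area / sum(weights)
    return [w * scale for w in weights]


if __name__ == "__main__":
    times = [8.0, 1.0]   # hypothetical baseline task times
    budget = 10.0        # hypothetical total area budget
    best = allocate_area(times, budget)
    uniform = [budget / len(times)] * len(times)
    print("allocation:", best)
    print("time (Lagrange):", total_time(times, best))
    print("time (uniform): ", total_time(times, uniform))
```

In this toy model the objective is convex for positive areas, so the stationary point is also the minimum. For general performance models that need not hold, which is the limitation the statement describes: the multiplier condition is necessary but does not by itself identify the optimal point.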
“…Further, we assume that task runtime depends only on the core's speedup function at its designated area and power (in a similar manner to [14] [3] [37] [22] [38] [8]). Our model, however, does account for microarchitecture differences, as each core may have its own area- and power-to-performance model.…”
Section: A Workload (mentioning)
confidence: 99%