2009 IEEE International Symposium on Workload Characterization (IISWC) 2009
DOI: 10.1109/iiswc.2009.5306793

Understanding PARSEC performance on contemporary CMPs

Cited by 59 publications (34 citation statements)
References 16 publications
“…We observed that this particular benchmark had very poor scaling when run on a large number of cores and that parallel performance gains were only realized when using a very large input. Previous work on evaluating the performance of PARSEC note that for Streamcluster, "95% of compute cycles are spent finding the Euclidean distance between two points" and that "Scaling is sensitive to memory speeds and bus contention" [12]. However, our analysis reveals that the performance issue for Streamcluster does not stem from where compute cycles are being spent, but rather where they are not being spent.…”
Section: Inefficient Barrier Implementation
type: contrasting
confidence: 49%
“…These benchmarks have more regular memory access patterns, which gives them relatively good cache locality. Similar trends in behaviour for compute-bound and memory-bound benchmarks on simultaneous multi-threaded (SMT) multicore architectures has been observed for the PARSEC benchmark suite [6].…”
Section: Scalability Study
type: supporting
confidence: 48%
“…(Since we are interested in pushing the number of threads to hundreds, we leave out benchmarks from the kit that either have very limited scalability, or that cannot be spawned with hundreds of threads [7] [14].) We chose the PARSEC kit because it represents emerging workloads, specifically modeling future CMP applications [15].…”
Section: B Workload and System Parameters
type: mentioning
confidence: 99%