Proceedings of the Compilation of the Co-Located Workshops on DSM'11, TMC'11, AGERE! 2011, AOOPES'11, NEAT'11, & VMIL'11 (2011)
DOI: 10.1145/2095050.2095100

A microbenchmark case study and lessons learned

Abstract: The extra abstraction layer imposed by the virtual machine, the JIT compilation cycles, and the asynchronous garbage collection are the main reasons that benchmarking Java code is a delicate task. The primary weapon in battling these is replication: "billions and billions of runs" is a phrase sometimes used by practitioners. This paper describes a case study, which consumed hundreds of hours of CPU time, and tries to characterize the inconsistencies in the results we encountered.
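
The replication strategy the abstract alludes to is easiest to see in code. Below is a minimal, hand-rolled sketch of that idea: discard many warm-up runs so JIT compilation can settle, then time many measured runs and report the spread rather than a single number. The workload, the run counts, and the use of System.nanoTime are illustrative assumptions, not the paper's actual harness.

```java
public final class ReplicatedBenchmark {

    private static final int WARMUP_RUNS = 10_000;    // let the JIT stabilize
    private static final int MEASURED_RUNS = 100_000; // "billions and billions", in spirit

    // Hypothetical workload under test.
    private static long workload() {
        long sum = 0;
        for (int i = 0; i < 1_000; i++) sum += (long) i * i;
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0; // consume results so the JIT cannot eliminate the workload

        for (int i = 0; i < WARMUP_RUNS; i++) sink += workload(); // timings discarded

        long[] samples = new long[MEASURED_RUNS];
        for (int i = 0; i < MEASURED_RUNS; i++) {
            long start = System.nanoTime();
            sink += workload();
            samples[i] = System.nanoTime() - start;
        }

        java.util.Arrays.sort(samples);
        System.out.printf("median=%d ns  p95=%d ns  (sink=%d)%n",
                samples[MEASURED_RUNS / 2],
                samples[(int) (MEASURED_RUNS * 0.95)],
                sink);
    }
}
```

Even this much structure does not remove the inconsistencies the paper reports; repeating the whole experiment across several fresh JVM invocations (forks) is the usual next step.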

Cited by 13 publications (7 citation statements) · References 14 publications
“…The mean change rate between the three stoppage criteria and the static approach is ~3% or lower for all three. Note that, following a rigorous measurement methodology, ~3% could still be caused by JVM instabilities unrelated to our approach [19]. Again, RCIW is the best criterion with 1.4% ± 3.8%.…”
Section: RQ
confidence: 86%
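
RCIW here stands for the relative confidence interval width: the width of a confidence interval around the mean, divided by the mean, used to decide when enough measurements have been collected. A minimal sketch of such a stoppage check follows; the normal-approximation interval, the 95% z-value, and the 3% threshold are assumptions for illustration (the cited work may well use bootstrap intervals and other parameters).

```java
// Sketch of an RCIW stoppage check: stop measuring once the confidence
// interval around the sample mean is narrow relative to the mean.
// Assumes n >= 2 samples; 1.96 is the 95% normal-approximation z-value.
static boolean rciwBelowThreshold(double[] samples, double threshold) {
    int n = samples.length;

    double mean = 0;
    for (double s : samples) mean += s;
    mean /= n;

    double var = 0;
    for (double s : samples) var += (s - mean) * (s - mean);
    var /= (n - 1); // unbiased sample variance

    double halfWidth = 1.96 * Math.sqrt(var / n); // CI half-width for the mean
    double rciw = (2 * halfWidth) / mean;         // interval width relative to the mean
    return rciw < threshold;                      // e.g. threshold = 0.03 for 3%
}
```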
“…Benchmarks often exhibit multiple steady states, resulting in multi-modal distributions, and outliers due to non-deterministic behavior might still occur even after a fork is considered to be in a steady state [19]. Therefore, our approach uses a fixed number of measurement iterations mi (lines 8–9), as a single measurement iteration would not accurately represent a fork's performance.…”
Section: Algorithm 1: Dynamic Reconfiguration Algorithm
confidence: 99%
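
The iteration structure described there can be sketched as below: every fork contributes a fixed number of measurement iterations mi rather than a single one, so multi-modal or outlier-prone behavior within a fork is still captured. Spawning a fresh JVM per fork is elided, and the parameter names and units are illustrative, not the cited algorithm's actual code.

```java
// Sketch of fixed per-fork measurement iterations: samples[f][i] is the
// time of iteration i in fork f. In a real harness each fork would be a
// separate JVM process; here the fork loop only illustrates the structure.
static double[][] measureForks(int forks, int mi, Runnable benchmark) {
    double[][] samples = new double[forks][mi];
    for (int f = 0; f < forks; f++) {
        for (int i = 0; i < mi; i++) { // fixed mi iterations per fork
            long start = System.nanoTime();
            benchmark.run();
            samples[f][i] = (System.nanoTime() - start) / 1e6; // milliseconds
        }
    }
    return samples;
}
```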
“…For each benchmark, it shows the smallest reduction in performance at a frequency of 60 s, except for the Arith benchmark in Figure 11. For Arith, RIM4J imposes a 68% reduction at a frequency of 60 s, but a 60% reduction at a frequency of 5 s. We attribute these small differences mostly to instabilities of the JVM rather than to the impact of our RIM4J [48].…”
Section: Performance Effect on Microbenchmarks
confidence: 87%
“…There is much evidence that it is easy to unwittingly conduct performance experiments that produce misleading results [6,7,4,8,21,3,14]. On contemporary execution platforms, even small-scale performance experiments are turning from an easy and reliable way of evaluating software performance into a difficult exercise in tracking a multitude of technical details that must be dealt with even in otherwise very simple scenarios.…”
Section: Introduction
confidence: 99%