Sampling profilers are popular because of their low and adjustable overhead and because they do not distort the profile by modifying the application code. A typical sampling profiler periodically suspends the application threads, walks their stacks, and merges the resulting stack traces into a calling context tree. Java virtual machines offer a convenient interface to accomplish this, but rely on safepoints, a synchronization mechanism that requires all threads to park in a safe location. However, a profiler is primarily interested in the running threads, and waiting for all threads to reach a safe location significantly increases the overhead. In most cases, taking a complete stack trace is also unnecessary because many stack frames remain unchanged between samples.

We present three techniques that reduce the overhead of sampling Java applications. Partial safepoints require only a certain number of threads to enter a safepoint and can be used to sample only the running threads. With self-sampling, we parallelize taking stack traces by having each thread take its own stack trace. Finally, incremental stack tracing constructs stack traces lazily and examines each stack frame only once instead of walking the entire stack for each sample. Our techniques require no support from the operating system or hardware. With our implementation in the popular HotSpot virtual machine, we show that we can significantly reduce the overhead of sampling without affecting the accuracy of the profiles.
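To make the baseline concrete, the following is a minimal sketch of the conventional approach described above: take full stack traces of all threads, keep the running ones, and merge each trace into a calling context tree. It is not the implementation evaluated in the paper and uses none of the three proposed techniques. The names `CctNode`, `SimpleSampler`, and `sampleOnce` are hypothetical; the only platform calls used are the standard `Thread.getAllStackTraces()` and `Thread.getState()`. The sketch assumes that a thread's state is read shortly after its trace was taken, so the two can disagree slightly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One node of a calling context tree (hypothetical illustration type). */
final class CctNode {
    final String frame;
    long samples;      // samples whose stack passes through this node
    long selfSamples;  // samples whose innermost frame is this node
    final Map<String, CctNode> children = new HashMap<>();

    CctNode(String frame) { this.frame = frame; }

    /** Merges one stack trace, given outermost frame first, into this tree. */
    void merge(List<String> framesRootFirst) {
        CctNode node = this;
        node.samples++;
        for (String f : framesRootFirst) {
            // Reuse the child for an already-seen calling context, else create it.
            node = node.children.computeIfAbsent(f, CctNode::new);
            node.samples++;
        }
        node.selfSamples++;
    }
}

/** Hypothetical baseline sampler: walks every stack completely on each sample. */
final class SimpleSampler {
    static void sampleOnce(CctNode root) {
        // Assumption: the state check below only approximates "running" threads,
        // because the state is read after the traces were collected.
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] trace = e.getValue(); // innermost frame first
            if (trace.length == 0 || e.getKey().getState() != Thread.State.RUNNABLE) {
                continue;
            }
            List<String> frames = new ArrayList<>();
            for (int i = trace.length - 1; i >= 0; i--) {
                frames.add(trace[i].getClassName() + "." + trace[i].getMethodName());
            }
            root.merge(frames);
        }
    }
}
```

Calling `SimpleSampler.sampleOnce(root)` from a timer thread at a fixed interval yields a tree whose `samples` counts approximate where the running threads spend their time. Every sample here re-walks each stack from top to bottom, which is the repeated work that incremental stack tracing is meant to avoid.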
Concurrent programming can lead to considerable performance gains on modern, parallel hardware when compared to single-core systems. When dealing with parallelization, however, developers have to explicitly address synchronization to safely access shared resources, which is typically achieved with locks. Choosing between simpler but less scalable locking mechanisms and more sophisticated but error-prone ones is difficult during development. Therefore, lock contention analysis at run time is crucial to aid in such decisions. There are already various lock contention profilers for different languages and scenarios (cf. Chapter 6), but most of them do not capture all the information necessary to actually fix potential performance bottlenecks (e.g., where contention was caused). Those that do cannot be applied universally but are restricted to specific environments (e.g., bound to a specific virtual machine implementation), which may be problematic for use in production systems.

I would also like to extend my thanks to the Christian Doppler Forschungsgesellschaft and to Dynatrace Austria for funding this work. Finally, I want to thank my friends and especially my family, who supported me during my studies at the Johannes Kepler University. I could not have accomplished all this without you. Thank you very much!