2012
DOI: 10.1145/2248487.2151001

Bottleneck identification and scheduling in multithreaded applications

Abstract: Performance of multithreaded applications is limited by a variety of bottlenecks, e.g. critical sections, barriers and slow pipeline stages. These bottlenecks serialize execution, waste valuable execution cycles, and limit scalability of applications. This paper proposes Bottleneck Identification and Scheduling in Multithreaded Applications (BIS), a cooperative software-hardware mechanism to identify and accelerate the most critical bottlenecks. BIS identifies which bottlenecks are likely to reduce performance…

Citations: cited by 11 publications (14 citation statements)
References: 39 publications
“…T parallel is the parallel portion, which can be eaten away by spawning more threads. For online usage, T serial and T parallel can be accurately obtained by loop peeling method [12] that finishes executions repeatedly and learns the ratio of the serial portion, or, instrumentation technique [24] that inserts bottleneck-identification instructions at the entry and exit of serial and parallel portions to record elapsed cycles. As the distinctive part of α model, multithreading overhead T penalty indeed reveals the bottleneck of scale-out speedup [24].…”
Section: Scale-out Speedup: α (mentioning)
confidence: 99%
“…For online usage, T serial and T parallel can be accurately obtained by loop peeling method [12] that finishes executions repeatedly and learns the ratio of the serial portion, or, instrumentation technique [24] that inserts bottleneck-identification instructions at the entry and exit of serial and parallel portions to record elapsed cycles. As the distinctive part of α model, multithreading overhead T penalty indeed reveals the bottleneck of scale-out speedup [24]. It is determined by 1) synchronization contentions, such as inter-thread locks and barriers, and 2) communication contentions, happened on communication-related hardware resources of LLC, memory controller, and memory bus etc.…”
Section: Scale-out Speedup: α (mentioning)
confidence: 99%
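
The instrumentation idea in the statement above (marking the entry and exit of serial and parallel portions and recording the elapsed time in between) can be illustrated with a minimal Python sketch. This is not the cited paper's mechanism, nor the citing paper's α model: the names `region`, `elapsed_ns`, and `serial_fraction` are hypothetical, and wall-clock nanoseconds stand in for the elapsed cycles that the quoted text describes.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical accumulator: elapsed nanoseconds per region kind.
elapsed_ns = defaultdict(int)


@contextmanager
def region(kind):
    """Record the time spent between entry to and exit from a code region."""
    start = time.perf_counter_ns()  # entry marker
    try:
        yield
    finally:
        # Exit marker: add the elapsed time to this kind's running total.
        elapsed_ns[kind] += time.perf_counter_ns() - start


def serial_fraction(totals):
    """Share of the measured time that was spent in serial regions."""
    total = totals["serial"] + totals["parallel"]
    return totals["serial"] / total if total else 0.0


with region("serial"):
    data = list(range(10_000))  # stand-in for a serial portion

with region("parallel"):
    # Stand-in for work that threads would share in a real program.
    squares = [x * x for x in data]

fraction = serial_fraction(elapsed_ns)
```

A real implementation would read hardware cycle counters and keep per-thread totals; the sketch only shows where the entry and exit markers sit and how the serial share falls out of the two accumulated totals.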
“…First, the scheduling unit in existing techniques is either interval based (fixed-instruction interval [70,72,73,78,81] or fixed-time interval [57,65,79,88,89]) or a code segment (e.g., critical sections, lagging threads, application kernels [54,68,69,87]). The scheduling unit in the event-based scheduling is the event handler in interactive mobile Web applications.…”
Section: Related Work (mentioning)
confidence: 99%