2015
DOI: 10.1109/tc.2014.2361526

Control-Flow Decoupling: An Approach for Timely, Non-Speculative Branching

Abstract: Mobile and PC/server class processor companies continue to roll out flagship core microarchitectures that are faster than their predecessors. Meanwhile, placing more cores on a chip, coupled with constant supply voltage, puts per-core energy consumption at a premium. Hence, the challenge is to find future microarchitecture optimizations that not only increase performance but also conserve energy. Eliminating branch mispredictions, which waste both time and energy, is valuable in this respect. In this paper, we expl…

Cited by 10 publications (7 citation statements)
References 32 publications
“…Further reducing mispredictions in software may be hard. However, previous hardware proposals [41] could help: since the loop trip count is generated outside of the loop body, the count could be communicated to the branch predictor in hardware, completely eliminating branch mispredictions.…”
Section: Mitigating Branch Misprediction Penalty (mentioning)
confidence: 99%
“…Speculative multithreading executes pre-computation slices [56] with architectural support to validate speculations, relies on ultra-light-weight threads to perform prefetching [13,18,61] or requires hardware communication channels between the prefetching and the main thread [49,53,58]. CFD [64] requires an architectural queue to efficiently communicate branch predicates that are loaded early in advance. Other proposals, most notably Multiscalar [24,66,74], combine software and hardware to enable instruction level parallelism using compiler-generated code structures, i.e., tasks, which can be executed simultaneously on multiple processing units.…”
Section: Related Work (mentioning)
confidence: 99%