2014
DOI: 10.1145/2666357.2597810

Efficient code generation in a region-based dynamic binary translator

Abstract: Region-based JIT compilation operates on translation units comprising multiple basic blocks and the (possibly cyclic or conditional) control flow between them. It promises to reconcile aggressive code optimisation and low compilation latency in performance-critical dynamic binary translators. Whilst various region selection schemes and isolated code optimisation techniques have been investigated, it remains unclear how best to exploit such regions for efficient code generation. Complex interactions with indirect br…

Cited by 3 publications (5 citation statements)
References 29 publications
“…This reduction in simulation efficiency can be reduced by using a on-FPGA embedded or soft processor to run the software fall-back model rather than require the high-latency communication to the host computer. The 16-core simulation achieves simulation rates up to 62 MIPS, and even the theoretical 100 MIPS throughput of the BlueSPARC engine is significantly slower than modern JIT compiled software simulators (typically several hundred MIPS per core, with results up to 1323 MIPS per core achieved by the current state of the art [47]). …”
Section: Protoflex (mentioning)
confidence: 99%
“…This means that these benchmarks also measure the handling of self modifying code. Furthermore, when optimisations such as concurrent code generation [7] or region-based code generation [28] are applied, these benchmarks will help to measure the effectiveness of these techniques.…”
Section: B Benchmark Categories (mentioning)
confidence: 99%
“…Control flow can also be split into direct (where the branch target is known in advance, and is encoded as an absolute or relative path into the instruction) and indirect (where the branch target is read from memory or a register). A large amount of work has been done on optimising the various forms of control flow [28,20,15,17].…”
Section: B Benchmark Categories (mentioning)
confidence: 99%
“…Guest basic blocks are terminated at page boundaries for memory protection purposes. Normal control flow out of a block is optimized utilising techniques from Spink et al [2014], which includes directly chaining to other basic blocks that are part of the same memory page to avoid costly returns to the main execution loop. If a translation does not exist, or the destination does not live on the same page, then control is returned to the main execution loop, which will then handle the situation accordingly.…”
Section: CPU Virtualization (mentioning)
confidence: 99%
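
The chaining behaviour described in the last statement can be illustrated with a minimal sketch. This is not the implementation from Spink et al. or from the citing paper; the names `Translation`, `same_page`, and `run`, the 4 KiB page size, and the single-successor block model are all assumptions made for illustration only.

```python
PAGE_SIZE = 4096  # assumed guest page size; the source does not state one


def same_page(addr_a, addr_b):
    """True if both guest addresses fall in the same memory page."""
    return addr_a // PAGE_SIZE == addr_b // PAGE_SIZE


class Translation:
    """Hypothetical translated basic block with one static successor."""

    def __init__(self, start, next_pc):
        self.start = start      # guest address of the block's first instruction
        self.next_pc = next_pc  # guest address control flows to on exit
        self.chained = None     # direct link to a same-page successor, if any


def run(cache, pc, max_blocks):
    """Execute blocks from `pc`; return (trace, main_loop_lookups)."""
    trace = []
    lookups = 0
    while len(trace) < max_blocks:
        # Main execution loop: look up the translation for the current PC.
        block = cache.get(pc)
        lookups += 1
        if block is None:
            break  # a real translator would translate the block here
        while block is not None and len(trace) < max_blocks:
            trace.append(block.start)  # stand-in for running the block
            pc = block.next_pc
            nxt = block.chained
            if nxt is None:
                target = cache.get(pc)
                # Chain only if a translation exists AND it lives on the
                # same page as the current block.
                if target is not None and same_page(block.start, pc):
                    block.chained = target
                    nxt = target
            # With no chain, fall back to the main execution loop.
            block = nxt
    return trace, lookups


# Two blocks on one page, a third on the next page, then untranslated code.
cache = {
    0x1000: Translation(0x1000, 0x1040),
    0x1040: Translation(0x1040, 0x2000),
    0x2000: Translation(0x2000, 0x9000),
}
trace, lookups = run(cache, 0x1000, 10)
```

In this sketch the transfer from `0x1000` to `0x1040` is chained and bypasses the main loop, while the cross-page transfer to `0x2000` and the untranslated target `0x9000` both return to it.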