2018
DOI: 10.1002/spe.2573

Efficient and retargetable SIMD translation in a dynamic binary translator

Abstract: The single-instruction multiple-data (SIMD) computing capability of modern processors is continually improved to deliver ever better performance and power efficiency. For example, Intel has increased SIMD register lengths from 128 bits in Streaming SIMD Extensions (SSE) to 512 bits in AVX-512. The ARM Scalable Vector Extension (SVE) supports SIMD register lengths of up to 2048 bits and includes predicated instructions. However, SIMD instruction translation in dynamic binary translation has not received similar attention. For e…


citations
Cited by 7 publications
(3 citation statements)
references
References 46 publications
“…A number of static binary lifters target the LLVM IR [9,12,14,67,73] for analysis and transformation. Previous work provided correct translations for SIMD [24,25] or floating point instructions [25,75]. However, these translation tools do not support concurrency.…”
Section: Overall Impact On Code Size (mentioning)
confidence: 99%
“…Crucially, the translation must preserve the semantics of the original binary, as specified by the original architecture, whilst also optimizing the target binary in a discernible way. Although SBT tools have gained popularity [24,25,74,75], their support of several advanced architectural features is often limited. For example, Microsoft's binary lifting prototype, called mctoll, was unable to lift the programs used in our evaluation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Cota et al [8] increased floating-point (FP) emulation performance by surrounding the use of host FP unit with a minimal amount of non-FP code. Spink et al [9] and Fu et al [10] translated guest SIMD instructions to host SIMD instructions, aiming to exploit ISA-specific features. Clark et al [11] proposed a register-mapping approach to reduce memory access and context switching overhead.…”
Section: Introduction (mentioning)
confidence: 99%