2018
DOI: 10.1145/3173456

Improving SIMD Parallelism via Dynamic Binary Translation

Abstract: Recent trends in SIMD architecture have tended toward longer vector lengths, and more enhanced SIMD features have been introduced in newer vector instruction sets. However, legacy or proprietary applications compiled with short-SIMD ISA cannot benefit from the long-SIMD architecture that supports improved parallelism and enhanced vector primitives, resulting in only a small fraction of potential peak performance. This article presents a dynamic binary translation technique that enables short-SIMD binaries to e…

Cited by 12 publications (10 citation statements)
References 35 publications
“…The translation effect is better, and the results are consistent with the experimental expectation. Hong et al (2018) converted the short S.I.M.D. command into a discontinuous phrase and translated it, which greatly improved its speed [26].…”
Section: Discussion (mentioning)
confidence: 99%
“…Hong et al (2018) converted the short S.I.M.D. command into a discontinuous phrase and translated it, which greatly improved its speed [26]. Miura et al (2016) proposed a method to remember key discontinuous phrases in triangulation stage, and used key language model as additional information source in the transformation stage.…”
Section: Discussion (mentioning)
confidence: 99%
“…Dynamic rewriting of SIMD instructions has been proposed in [5,16] to find SIMD mappings between host and guest architectures in dynamic binary translation. The Dynamic Binary Translation system proposed in [7] details a technique to widen SIMD instructions during this mapping. It only targets loops and relies on recovery of LLVM IR from the binary, which is imprecise and can lead to spurious dependencies.…”
Section: Hand-written Kernel (Avx2) (mentioning)
confidence: 99%
“…It only targets loops and relies on recovery of LLVM IR from the binary, which is imprecise and can lead to spurious dependencies. Compared to [7], Revec is implemented as a compiler level transformation pass and inherently has access to loop structures without the need to recover them from the binary, making Revec more precise. Further, Revec applies to loop-free segments of code, making it more general than [7] -the Simd NeuralConvert benchmark that Revec greatly accelerates depends on this capability.…”
Section: Hand-written Kernel (Avx2) (mentioning)
confidence: 99%
“…In our work we rewrite binaries dynamically at runtime, hence allowing for optimizing long running applications, without the need for restart. Some techniques require uplifting binary into intermediate representation (IR) to perform analysis and preparations for parallelization (and possibly optimization itself) statically [10], [11], [12]. While this simplifies the transformation process, it adds extra time overhead for uplifting process.…”
Section: Binary Parallelization (mentioning)
confidence: 99%