2018 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS) 2018
DOI: 10.1109/coolchips.2018.8373078

EMAXVR: A programmable accelerator employing near ALU utilization to DSA

Cited by 4 publications (5 citation statements)
References 5 publications
“…In the hash update step, the final 256-bit hash value, which is divided into 8 chunks of 32-bit data HO_0, HO_1, ..., HO_7, is computed by adding the initial hashes H_0, H_1, ..., H_7 to the loop hashes a, b, c, d, e, f, g, h, as illustrated in eq. (13). Third, data dependence among the loops is present.…”
Section: Background, A. SHA-256 Algorithm (mentioning)
confidence: 99%
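The hash update step quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the citing paper's eq. (13) or its hardware design: the function names `hash_update` and `to_digest` are hypothetical, and the wrap-around to 32 bits is an assumption taken from the standard SHA-256 definition, since the excerpt only says the values are added.

```python
MASK32 = 0xFFFFFFFF  # keep every sum within 32 bits (assumed: addition modulo 2^32)


def hash_update(initial, working):
    """Add the initial hashes H_0..H_7 to the loop hashes a..h, word by word.

    `initial` and `working` are sequences of eight 32-bit integers.
    Returns the eight output words HO_0..HO_7.
    """
    if len(initial) != 8 or len(working) != 8:
        raise ValueError("expected eight 32-bit words in each input")
    return [(h + w) & MASK32 for h, w in zip(initial, working)]


def to_digest(words):
    """Concatenate eight 32-bit words (big-endian) into a 256-bit value."""
    return b"".join(w.to_bytes(4, "big") for w in words)
```

Each output word depends only on its own pair of inputs, so the eight additions are independent of one another; the data dependence mentioned in the excerpt lies in the loops that produce a..h, not in this final step.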
“…Despite these disadvantages, multicore CPUs and GPUs are currently considered to be the most applicable hardware platforms for calculating SHA-256 in Bitcoin mining and other blockchain networks. In another approach, the systolic array-based accelerator named EMAXVR in [13] and its improved version in [14] were applied to reduce the data access time by implementing local memory near the ALU. Although this platform achieves high performance on image processing and AI learning [13], [14], its performance for computing SHA-256 is very poor [15].…”
Section: Background, A. SHA-256 Algorithm (mentioning)
confidence: 99%
“…Although systolic arrays are less programmable than von Neumann computing platforms, the configurability is promising for arranging many data streams among ALUs, LMMs, and external memory in parallel. The utilization ratio of the ALUs and the traffic between external memory and LMM are comparable with state-of-the-art DSAs [21]. However, the first problem is unavoidable: wide memory bandwidth.…”
Section: Problems With Conventional and Modern Systolic Arrays (mentioning)
confidence: 83%
“…We believe that with their high computational and memory resources, reprogrammable hardware design, low power consumption, and high optimization for parallel pipeline processes, FPGAs are well suited for Scrypt implementation. There are several high-performance architectures that can be applied on FPGAs to reduce the memory access time, such as the systolic-array-based accelerator called EMAXVR [41], [42] used in near-memory computing. However, despite exhibiting high performance in machine learning and image processing applications, they can achieve only poor performance when performing low-cost operator hash functions [43].…”
Section: Preliminary Idea and Motivation for the High-Performance Multi ROMix Scrypt Accelerator (mentioning)
confidence: 99%