2016
DOI: 10.1109/mm.2016.8
Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules

Cited by 17 publications
(6 citation statements)
References 12 publications
“…Their approach focuses on an efficient instruction-offloading technique, and their framework introduces vault-level parallelism to improve computation throughput. In Reference [2], the authors utilize a number of lightweight cores in conjunction with commodity two-dimensional (2D) DRAMs to explore general-purpose NDP designs. They implement an NDP execution framework that utilizes the same ISA as the host processor and is thus fully compatible with existing commercial processors.…”
Section: Related Work
confidence: 99%
“…We extend the RISC-V ISA to include the functionality necessary to support the NDP paradigm. A processor ISA extension for NDP is also considered in Reference [2], where the authors argue that such an approach provides compatibility with existing processing platforms. To this end, we implement jump-and-link-PIM (JalPim), an instruction that behaves like the original jump-and-link (Jal) instruction and thus triggers a function call.…”
Section: Host System Architecture
confidence: 99%
“…In other related work [17], the authors incorporate heterogeneous reconfigurable logic arrays, which behave like CGRAs, in order to improve throughput and reduce the power consumption of target applications. CGRA capabilities are also explored in [18], along with different TSV interconnection networks, in order to find the CGRA-TSV combination that yields the highest speedup. A common target application of CGRAs and NDP is the training and inference of deep neural networks, as previous works in [19] and [20] demonstrate.…”
Section: Related Work
confidence: 99%
“…In contrast, our work does not require any profiling operation prior to code execution, because the CGRA is designed for loop acceleration and can therefore support any issued loop without additional effort. Further, the authors in [8], [16], [17], and [18] utilize CGRAs in conjunction with the NDP paradigm, but their focus shifts to different aspects of the NDP execution paradigm. Under this premise, previous works lack the application-mapping approach or the loop-acceleration focus we employ, as they do not utilize the CGRA network to execute instructions in an iterative way, i.e.…”
Section: Related Work
confidence: 99%