This paper presents a design methodology for hardware synthesis of application-specific logic-in-memory (LiM) blocks. Logic-in-memory designs tightly integrate specialized computation logic with embedded memory, enabling more localized computation, thus save energy consumption. As a demonstration, we present an end-to-end design framework to automatically synthesize an interpolation based logic-inmemory block named interpolation memory, which combines a seed table with simple arithmetic logic to efficiently evaluate functions. In order to support multiple consecutive seed data access that is required in the interpolation operation, we synthesize the physical memory into the novel rectangularaccess smart memory blocks. We evaluated a large design space of interpolation memories in sub-20 nm commercial CMOS technology by using the proposed design framework. Furthermore, we implemented a logic-in-memory based computed tomography (CT) medical image reconstruction system and our experimental results show that the logic-in-memory computing method achieves orders of magnitude of energy saving compared with the traditional in-processor computing.
This paper focuses on the trade-off between flexibility and efficiency in specialized computing. We observe that specialized units achieve most of their efficiency gains by tuning data storage and compute structures and their connectivity to the data-flow and data-locality patterns in the kernels. Hence, by identifying key data-flow patterns used in a domain, we can create efficient engines that can be programmed and reused across a wide range of applications.
We present an example, the Convolution Engine (CE), specialized for the convolution-like data-flow that is common in computational photography, image processing, and video processing applications. CE achieves energy efficiency by capturing data reuse patterns, eliminating data transfer overheads, and enabling a large number of operations per memory access. We quantify the tradeoffs in efficiency and flexibility and demonstrate that CE is within a factor of 2-3x of the energy and area efficiency of custom units optimized for a single kernel. CE improves energy and area efficiency by 8-15x over a SIMD engine for most applications.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.