Convolutional Neural Networks (CNNs) are now also achieving impressive performance on non-classification image processing tasks, such as denoising, demosaicing, super-resolution, and super slow motion. Consequently, CNNs are increasingly deployed on very high resolution images. However, the resulting high resolution feature maps place unprecedented demands on the memory system of neural network processors: on-chip memories are too small to store high resolution feature maps, while off-chip memories are very costly in terms of I/O bandwidth and power. This paper first shows that classical layer-by-layer inference approaches are fundamentally bounded in their external I/O bandwidth vs. on-chip memory trade-off space, making it infeasible to scale to very high resolutions at reasonable cost. Next, we demonstrate how an alternative depth-first network computation can reduce I/O bandwidth requirements by more than 200× for a fixed on-chip memory size or, alternatively, reduce on-chip memory requirements by more than 10,000× for a fixed I/O bandwidth budget. We further introduce an enhanced depth-first method that exploits both line buffers and tiling to further improve the external I/O bandwidth vs. on-chip memory capacity trade-off, and quantify its improvements over the current state of the art.
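To illustrate the core idea behind depth-first computation, the following minimal sketch (an illustrative toy, not the paper's actual method) contrasts the two schedules on a stack of two 3-tap 1-D convolutions: the layer-by-layer schedule materializes the full intermediate feature map, whose size grows with the input resolution, while the depth-first schedule streams the input through per-layer line buffers of only 3 samples each, so peak on-chip storage is constant regardless of resolution. All function and variable names here are hypothetical.

```python
import numpy as np

def conv1d_valid(x, k):
    """Direct 'valid' 1-D correlation with a 3-tap kernel."""
    return np.array([np.dot(x[i:i + 3], k) for i in range(len(x) - 2)])

def layer_by_layer(x, k1, k2):
    """Classical schedule: the full intermediate map is materialized."""
    inter = conv1d_valid(x, k1)           # peak buffer grows with len(x)
    return conv1d_valid(inter, k2), len(inter)

def depth_first(x, k1, k2):
    """Depth-first schedule: each layer keeps only a 3-sample line buffer."""
    buf1, buf2, out = [], [], []
    for sample in x:                      # stream the input sample by sample
        buf1.append(sample)
        if len(buf1) == 3:                # layer-1 window complete
            buf2.append(np.dot(buf1, k1))
            buf1.pop(0)                   # slide the layer-1 line buffer
            if len(buf2) == 3:            # layer-2 window complete
                out.append(np.dot(buf2, k2))
                buf2.pop(0)               # slide the layer-2 line buffer
    return np.array(out), 6               # peak buffer: 3 + 3 samples, fixed

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
k1 = np.array([1.0, 2.0, 1.0]) / 4.0      # smoothing kernel
k2 = np.array([-1.0, 0.0, 1.0])           # gradient kernel

ref, mem_lbl = layer_by_layer(x, k1, k2)
df, mem_df = depth_first(x, k1, k2)
assert np.allclose(ref, df)               # identical results ...
print(mem_lbl, mem_df)                    # ... at a far smaller peak buffer
```

The same principle carries over to 2-D feature maps, where each line buffer holds a few image rows instead of a few samples; the enhanced method in the paper additionally tiles the image to shrink those buffers further.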