2022
DOI: 10.1109/tpami.2021.3089687
GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices Based on Fine-Grained Structured Weight Sparsity

Cited by 11 publications (4 citation statements). References 45 publications.
“…Many companies have developed dedicated Neural Processing Units (NPUs) for mobile devices, which can process many trained Deep Neural Networks (DNNs) based applications in real-time. Even though training a DNN may suffer from a long delay, testing the network can be done in real-time [3,4]. Various methods have been proposed in the literature that focused on leveraging neural network based coding models for image and video compression.…”
Section: Introduction
confidence: 99%
“…RigL [90], ITOP [91], SET [104], DSR [89], and MEST [86], is provided in Tab. The three main sparsity schemes introduced in the area of network pruning consist of unstructured [105][106][107], structured [3,45,[108][109][110][111][112][113][114][115][116][117][118][119], and fine-grained structured pruning [120][121][122][123][124][125][126][127][128][129].…”
Section: Discussion
confidence: 99%
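The distinction drawn in that citation statement can be made concrete. The sketch below is an illustrative example of fine-grained structured sparsity (not GRIM's actual implementation): weights are pruned inside fixed-size blocks so that each block keeps a regular pattern of nonzeros that hardware can exploit, unlike fully unstructured pruning. The function name and block parameters are assumptions for illustration.

```python
import numpy as np

def block_prune(weights, block=4, keep=1):
    """Within every `block` consecutive weights of a row, keep only the
    `keep` largest-magnitude entries and zero the rest."""
    w = weights.reshape(-1, block).copy()
    # Indices of the smallest (block - keep) magnitudes in each block.
    drop = np.argsort(np.abs(w), axis=1)[:, : block - keep]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))
Wp = block_prune(W, block=4, keep=1)
# Each group of 4 weights now holds exactly one nonzero entry:
# 75% sparsity with a regular 1-in-4 pattern, rather than
# arbitrarily scattered zeros.
print((Wp != 0).sum(axis=1))  # → [2 2]
```

The regularity is the point: a 1-in-4 pattern lets an accelerator skip zeros with fixed-stride indexing, which unstructured sparsity cannot guarantee.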
“…As a result, processing one bit of all active inputs requires 128 × (1/1.2 GHz) = 106.6 ns. Instead, FORMS employs four 4-bit ADCs (within the same area as one 8-bit ADC, but at 1.8× higher frequency) to compute 128 dot-products, which results in a cycle time of (128/4) × (1/2.1 GHz) ≈ 15 ns. As a result, FORMS improves the cycle time, which helps increase the throughput.…”
Section: Overall Architecture
confidence: 99%
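The cycle-time arithmetic quoted in that statement can be checked directly. This is a minimal sketch of the calculation only; the helper function name is an assumption, not anything from the paper.

```python
def cycle_time_ns(dot_products, adcs, freq_ghz):
    """Time to serialize `dot_products` conversions across `adcs` ADCs
    running at `freq_ghz` GHz, in nanoseconds (1/GHz = 1 ns)."""
    return (dot_products / adcs) * (1.0 / freq_ghz)

# Baseline: one 8-bit ADC at 1.2 GHz handling all 128 conversions.
baseline = cycle_time_ns(128, 1, 1.2)
# FORMS-style: four 4-bit ADCs at 2.1 GHz sharing the 128 dot-products.
forms = cycle_time_ns(128, 4, 2.1)
print(round(baseline, 1), round(forms, 1))  # → 106.7 15.2
```

This reproduces the quoted 106.6 ns and ~15 ns figures (up to rounding), a roughly 7× cycle-time improvement.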