2021
DOI: 10.48550/arxiv.2110.12844
Preprint

Network compression and faster inference using spatial basis filters

Abstract: We present an efficient alternative to the convolutional layer through utilising spatial basis filters (SBF). SBF layers exploit the spatial redundancy in the convolutional filters across the depth to achieve overall model compression, while maintaining the top-end accuracy of their dense counter-parts. Training SBF-Nets is modelled as a simple pruning problem, but instead of zeroing out the pruned channels, they are replaced with inexpensive transformations from the set of non-pruned features. To enable an ad…
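The core idea in the abstract can be sketched numerically: rather than zeroing pruned output channels, reconstruct them as cheap linear combinations of the kept "basis" channels. This is an illustrative sketch only; the variable names, the number of kept channels, and the 1x1-mixing formulation are assumptions, not the paper's actual formulation or API.

```python
import numpy as np

rng = np.random.default_rng(0)

n, c_out, h, w = 1, 8, 4, 4   # batch size, output channels, spatial dims
keep = 3                      # channels still computed by dense convolutions

# Outputs of the non-pruned ("basis") convolutional filters.
basis = rng.standard_normal((n, keep, h, w))

# Cheap per-channel mixing weights for the pruned channels (assumed form).
mix = rng.standard_normal((c_out - keep, keep))

# Reconstruct the pruned channels as linear combinations of the basis
# feature maps — equivalent to a 1x1 convolution over the kept channels.
recon = np.einsum("ok,nkhw->nohw", mix, basis)

# Full output: kept channels followed by the inexpensive reconstructions.
out = np.concatenate([basis, recon], axis=1)
assert out.shape == (n, c_out, h, w)
```

The saving comes from replacing full k×k convolutions for the pruned channels with a small mixing matrix applied to already-computed features.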

Cited by 1 publication (2 citation statements)
References 21 publications
“…Pruning [19,71,78,79,132,134,140,171,200,265,288] Quantization [19,68,90,134,166,179,291,307,311,314] Knowledge Distillation [29,41,42,80,83,88,95,170,186,195,220,228,231,239,257,266,267,274,295,296,300,312] Low rank factorization [76,98,119,168,190,196,210,292] Conditional Computation…”
Section: Model Compression
confidence: 99%
“…(4) Low Rank Factorization: Low rank factorization is a technique which helps in condensing the dense parameter weights of a DNN [98,190], limiting the number of computations done in convolutional layers [76,119,168,196], or both [210,292]. This technique is based on the idea of creating a low-rank matrix that approximates the dense matrices of a DNN's parameters, its convolutional kernels, or both.…”
Section: Model Compression
confidence: 99%
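The low-rank idea quoted above can be illustrated with a minimal sketch, assuming a truncated SVD as the factorization method (the surveyed works use various schemes; the matrix sizes and rank here are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# A dense layer's weight matrix (out_features x in_features).
W = rng.standard_normal((256, 128))
rank = 16  # chosen compression rank (illustrative assumption)

# Truncated SVD: W ≈ A @ B with two thin factors.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]   # shape (256, 16)
B = Vt[:rank, :]             # shape (16, 128)
W_approx = A @ B             # low-rank approximation of W

# Parameter count drops from 256*128 = 32768 to 16*(256+128) = 6144.
orig_params = W.size
lowrank_params = A.size + B.size
assert lowrank_params < orig_params
```

Applying the layer as `x @ B.T @ A.T` instead of `x @ W.T` yields the same parameter and FLOP reduction at inference time, at the cost of the approximation error controlled by the chosen rank.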