Proceedings of the International Conference on Industrial Control Network and System Engineering Research 2019
DOI: 10.1145/3333581.3333589
Layer-by-layer Quantization Method for Neural Network Parameters

Cited by 3 publications (1 citation statement) | References 6 publications
“…Through parameter quantization, the resolution of the parameters can be reduced to 16-bit, 8-bit, 4-bit, and even 1-bit with little loss of accuracy in some tasks. [26][27][28] Prakosa et al. 29 adopted the K-D [knowledge distillation] method to improve the performance of the pruned network. Blakeney et al. 30 proposed a parallel block-wise K-D method to compress deep neural networks.…”
Section: Introduction
Confidence: 99%
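The quoted passage describes mapping network parameters to 16-, 8-, 4-, or even 1-bit representations with little accuracy loss. The cited paper's exact scheme is not reproduced here, so the following is only a minimal sketch of generic per-layer symmetric uniform quantization (with sign-based binarization for the 1-bit case); quantize_layer, the scaling choices, and the toy layer shapes are illustrative assumptions, not taken from the paper.

import numpy as np

def quantize_layer(weights, num_bits=8):
    # Fake-quantize one layer's weights: map them to integer levels
    # and back to float so the accuracy impact can be measured
    # without changing the network's data types.
    if num_bits == 1:
        # 1-bit case: sign-based binarization scaled by the mean
        # magnitude (a common BinaryConnect-style choice; assumed
        # here, not necessarily the paper's method).
        return np.mean(np.abs(weights)) * np.sign(weights)
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8-bit
    scale = np.max(np.abs(weights)) / qmax
    if scale == 0.0:
        return weights.copy()  # all-zero layer: nothing to quantize
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale  # dequantize back to float

# Each layer gets its own scale, i.e. layer-by-layer quantization.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 32)), rng.standard_normal((32, 10))]
for bits in (16, 8, 4, 1):
    quantized = [quantize_layer(w, bits) for w in layers]
    err = max(np.max(np.abs(w - q)) for w, q in zip(layers, quantized))
    print(f"{bits}-bit: max per-weight error {err:.4f}")

Computing a separate scale per layer, rather than one scale for the whole network, is what makes the scheme layer-wise: layers with very different weight magnitudes each keep their full integer range.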