2020
DOI: 10.1609/aaai.v34i04.5912

RTN: Reparameterized Ternary Network

Abstract: To deploy deep neural networks on resource-limited devices, quantization has been widely explored. In this work, we study the extremely low-bit networks which have tremendous speed-up, memory saving with quantized activation and weights. We first bring up three omitted issues in extremely low-bit networks: the squashing range of quantized values; the gradient vanishing during backpropagation and the unexploited hardware acceleration of ternary networks. By reparameterizing quantized activation and weights vect…

Cited by 24 publications (28 citation statements)
References 21 publications
“…The weights are ternarized to {+1, 0, -1} by comparing with trained or given thresholds, as equation (4) shows, where w and w_t are the original and ternarized weight values, and TH_low and TH_high are the thresholds. Modern TWNs train the weights to be ternary values [14], [15], [16], so the weights of target neural networks are already quantized into 2-bit numbers when the training finishes.…”
Section: A. Overview (mentioning)
confidence: 99%
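The quoted passage refers to its equation (4) without reproducing it. Based only on the description above (thresholds TH_low and TH_high, original weight w, ternarized weight w_t), a plausible form of the thresholding rule is the following sketch, not a reconstruction of the cited paper's exact equation:

```latex
w_t =
\begin{cases}
+1, & w > TH_{\mathrm{high}} \\
0,  & TH_{\mathrm{low}} \le w \le TH_{\mathrm{high}} \\
-1, & w < TH_{\mathrm{low}}
\end{cases}
```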
“…BWNs quantize the weights of CNNs into {+1, -1} to replace the computation-intensive multiplication operations with addition and subtraction operations for high speedup, but this aggressive quantization also leads to lower accuracy. As both the accuracy and the speed of CNNs matter, other quantization methods, including 8-bit [10], [11] and 4-bit [12], [13] integer quantization (INT8 and INT4) and ternary quantization [14], [15], [16], have been proposed to trade off speed against accuracy, as Table I shows.…”
Section: Introduction (mentioning)
confidence: 99%
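To make the speed argument in the quoted statement concrete, here is a minimal Python sketch (not code from any of the cited papers) of binary-weight quantization in which multiplication by {+1, -1} weights reduces to additions and subtractions of the activations; the mean-absolute-value scale factor is an assumption borrowed from the common XNOR-Net convention.

```python
import numpy as np

def binarize_weights(w):
    """Map real-valued weights to {+1, -1} via their sign, with a per-tensor
    scale factor equal to the mean absolute value (assumed convention)."""
    alpha = np.abs(w).mean()          # scale factor recovering magnitude
    wb = np.where(w >= 0, 1.0, -1.0)  # binary weights in {+1, -1}
    return alpha, wb

def binary_dot(alpha, wb, x):
    """Dot product with binary weights: multiplying by +/-1 reduces to
    adding or subtracting the corresponding activation."""
    return alpha * np.where(wb > 0, x, -x).sum()

# Example: compare against the full-precision dot product.
w = np.array([0.7, -0.2, 0.05, -0.9])
x = np.array([1.0, 2.0, 3.0, 4.0])
print(binary_dot(*binarize_weights(w), x), np.dot(w, x))
```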
“…XNOR-Net++ [27] fuses the scale factors of the weights and activations in the binary convolution into a single parameter that the network can learn adaptively, addressing the problem in XNOR-Net that fixed scale factors confine the activations to a fixed interval. Similarly to XNOR-Net++, the RTN [28] network proposes an overall pipeline for reparameterizing the quantized values: the quantized activations are rescaled by a scale factor and their range is readjusted by an offset factor, achieving higher network capacity. Moreover, for 1-bit networks, performance is highly sensitive to the feature distribution of the activations.…”
Section: Binary Convolutional Neural Networks Based on Forward-Pass Improvements (unclassified)
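Based only on the description in the quoted passage (not on the RTN paper's actual formulation), a minimal PyTorch sketch of a quantized activation that is rescaled by a learnable scale factor and shifted by a learnable offset factor could look like this; the fixed threshold and the straight-through estimator are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ReparamTernaryActivation(nn.Module):
    """Hypothetical sketch (not the RTN reference code): ternarize the
    activations to {-1, 0, +1}, then reparameterize the quantized values
    with a learnable scale factor alpha and offset factor beta."""

    def __init__(self, threshold: float = 0.5):
        super().__init__()
        self.threshold = threshold                 # assumed fixed threshold
        self.alpha = nn.Parameter(torch.ones(1))   # scale factor
        self.beta = nn.Parameter(torch.zeros(1))   # offset factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hard ternarization with a symmetric threshold.
        t = torch.sign(x) * (x.abs() > self.threshold).float()
        # Straight-through estimator: quantized values in the forward pass,
        # identity gradient with respect to x in the backward pass.
        t = x + (t - x).detach()
        # Reparameterization: rescale and shift the quantized activations.
        return self.alpha * t + self.beta

# Example: quantize a random activation tensor.
y = ReparamTernaryActivation()(torch.randn(2, 4))
```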
“…Deep neural networks (DNNs) have achieved remarkable success in a wide range of applications; however, they suffer from substantial computation and energy costs. In order to obtain lightweight DNNs, network compression techniques have been widely developed in recent years, including network pruning (He, Zhang, and Sun 2017; Luo, Wu, and Lin 2017; Wen et al. 2019), quantization (Han, Mao, and Dally 2016; Wu et al. 2016; Li et al. 2020) and knowledge distillation (Hinton, Vinyals, and Dean 2015; Romero et al. 2014).…”
Section: Introduction (mentioning)
confidence: 99%