Improving Extreme Low-bit Quantization with Soft Threshold
Document type: Journal article
Author | Xu WX (许伟翔)1,2 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Publication date | 2022 |
Pages | 1549-1563 |
Abstract | Deep neural networks executing with low precision at inference time can gain acceleration and compression advantages over their high-precision counterparts, but must overcome accuracy degradation as the bit-width decreases. This work focuses on sub-4-bit quantization, where accuracy degradation is significant. We start with ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. We find that the hard threshold ∆ introduced in previous ternary networks for determining quantization intervals, together with the suboptimal solution of ∆, limits the performance of the ternary model. To alleviate this, we present Soft Threshold Ternary Networks (STTN), which enable the model to automatically determine ternarized values instead of depending on a hard threshold. Building on this, we generalize the idea of soft threshold from ternarization to arbitrary bit-width, yielding Soft Threshold Quantized Networks (STQN). We observe that previous quantization relies on the rounding-to-nearest function, which constrains the quantization solution space and leads to significant accuracy degradation, especially in low-bit (≤ 3-bit) quantization. Instead of relying on the traditional rounding-to-nearest function, STQN adaptively determines quantization intervals by itself. Accuracy experiments on image classification, object detection, and instance segmentation, as well as efficiency experiments on a field-programmable gate array (FPGA), demonstrate that the proposed framework can achieve a prominent trade-off between accuracy and efficiency. Code is available at: https://github.com/WeixiangXu/STTN. |
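Source URL | http://ir.ia.ac.cn/handle/173211/52073 |
Collection | Brain-Inspired Chips and Systems Research |
Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences |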
Recommended citation GB/T 7714 | Xu WX, Wang PS, Cheng J. Improving Extreme Low-bit Quantization with Soft Threshold[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022: 1549-1563. |
APA | Xu WX, Wang PS, & Cheng J. (2022). Improving Extreme Low-bit Quantization with Soft Threshold. IEEE Transactions on Circuits and Systems for Video Technology, 1549-1563. |
MLA | Xu WX, et al. "Improving Extreme Low-bit Quantization with Soft Threshold." IEEE Transactions on Circuits and Systems for Video Technology (2022): 1549-1563. |
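
The abstract contrasts the proposed soft threshold with two conventional baselines: ternarization with a hard threshold ∆, and rounding-to-nearest quantization. As a rough illustration of those baselines only (not of STTN or STQN themselves), a minimal NumPy sketch might look like the following. The function names are hypothetical, and the choice ∆ = 0.7 · mean(|w|) is an assumption borrowed from earlier ternary weight networks, not something stated in this record.

```python
import numpy as np

def ternarize_hard(w, delta):
    """Map each weight to {-1, 0, +1} using a fixed (hard) threshold delta."""
    t = np.zeros_like(w)
    t[w > delta] = 1.0    # clearly positive weights become +1
    t[w < -delta] = -1.0  # clearly negative weights become -1
    return t              # everything within [-delta, delta] stays 0

def quantize_round_to_nearest(w, bits):
    """Symmetric uniform quantization via round-to-nearest (requires bits >= 2)."""
    qmax = 2 ** (bits - 1) - 1          # largest integer level, e.g. 1 for 2 bits
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                    # de-quantized values on the uniform grid

w = np.array([-0.9, -0.3, 0.05, 0.2, 0.8])

# Assumed heuristic threshold; illustrative only.
delta = 0.7 * np.mean(np.abs(w))
t = ternarize_hard(w, delta)
q2 = quantize_round_to_nearest(w, bits=2)
```

In both functions the quantization intervals are fixed in advance, by ∆ in the first case and by the rounding grid in the second. According to the abstract, this is the constraint that STTN and STQN remove by letting the model determine the intervals itself.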
Ingest method: OAI harvesting
Source: Institute of Automation
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.