Chinese Academy of Sciences Institutional Repositories Grid
Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs

Document type: Journal article

Authors: Wang, Xueying (4,8); Li, Guangli (2,3,6,7); Jia, Zhen (1,5); Feng, Xiaobing (2,3,6,7); Wang, Yida (1,5)
Journal: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION
Publication date: 2024-03-01
Volume: 21; Issue: 1; Pages: 26
Keywords: Deep learning; Winograd convolution; low-precision computation
ISSN: 1544-3566
DOI: 10.1145/3632956
Abstract: Low-precision computation has emerged as one of the most effective techniques for accelerating convolutional neural networks and has garnered widespread support on modern hardware. Despite its effectiveness in accelerating convolutional neural networks, low-precision computation has not been commonly applied to fast convolutions, such as the Winograd algorithm, due to numerical issues. In this article, we propose an effective quantized Winograd convolution, named LoWino, which employs an in-side quantization method in the Winograd domain to reduce the precision loss caused by transformations. Meanwhile, we present an efficient implementation that integrates well-designed optimization techniques, allowing us to fully exploit the capabilities of low-precision computation on modern CPUs. We evaluate LoWino on two Intel Xeon Scalable Processor platforms with representative convolutional layers and neural network models. The experimental results demonstrate that our approach can achieve an average of 1.84x and 1.91x operator speedups over state-of-the-art implementations in the vendor library while preserving accuracy loss at a reasonable level.
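Illustrative note: the general idea of quantizing inside the Winograd domain, as the abstract describes it, can be sketched with a toy one-dimensional F(2,3) tile. This is not LoWino's implementation. The transform matrices are the standard F(2,3) ones; the function names, the symmetric per-tensor int8 scheme, and the sample values are assumptions chosen only for illustration.

```python
# Toy 1-D Winograd F(2,3): 2 outputs from a 4-element input tile and a 3-tap filter.
# Standard F(2,3) transform matrices.
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]      # input transform
G = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]  # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                                   # output transform


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def quantize(values, bits=8):
    """Symmetric per-tensor quantization (an assumed scheme, not the paper's)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def direct_conv(d, g):
    """Reference sliding-window correlation producing 2 outputs."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_fp(d, g):
    """Floating-point Winograd: A^T [(G g) * (B^T d)]."""
    u = matvec(G, g)
    v = matvec(BT, d)
    return matvec(AT, [a * b for a, b in zip(u, v)])


def winograd_int8(d, g):
    """Quantize AFTER the transforms, multiply in integers, then dequantize."""
    u_q, su = quantize(matvec(G, g))
    v_q, sv = quantize(matvec(BT, d))
    m_int = [a * b for a, b in zip(u_q, v_q)]   # integer element-wise products
    m = [x * su * sv for x in m_int]            # dequantize before output transform
    return matvec(AT, m)


d = [0.5, -1.25, 2.0, 0.75]   # hypothetical input tile
g = [0.25, -0.5, 1.0]         # hypothetical filter
ref = direct_conv(d, g)
fp = winograd_fp(d, g)
q8 = winograd_int8(d, g)
print("direct:", ref)
print("winograd fp:", fp)
print("winograd int8:", q8)
```

Because the scales are computed from the transformed values rather than from the raw input and filter, the value-range growth introduced by the transforms is absorbed before rounding; the int8 result stays close to the direct result on this sample tile.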
Funding: National Key R&D Program of China[2021ZD0110101] ; National Natural Science Foundation of China[62090024] ; National Natural Science Foundation of China[62232015] ; National Natural Science Foundation of China[62302479] ; China Postdoctoral Science Foundation[2023M733566] ; Innovation Funding of ICT, CAS[E361010]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:001193465400005
Publisher: ASSOC COMPUTING MACHINERY
Source URL: [http://119.78.100.204/handle/2XEOYT63/38764]
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Li, Guangli
Author affiliations: 1.Amazon Web Serv, 2795 Augustine Dr, Santa Clara, CA 95054 USA
2.Univ Chinese Acad Sci, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
3.Chinese Acad Sci, Inst Comp Technol, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
4.Beijing Univ Posts & Telecommun, 10 Xitucheng Rd, Beijing 100876, Peoples R China
5.Amazon Web Serv, Seattle, WA USA
6.Univ Chinese Acad Sci, Beijing, Peoples R China
7.Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
8.Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Recommended citation
GB/T 7714
Wang, Xueying,Li, Guangli,Jia, Zhen,et al. Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,2024,21(1):26.
APA Wang, Xueying,Li, Guangli,Jia, Zhen,Feng, Xiaobing,&Wang, Yida.(2024).Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs.ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,21(1),26.
MLA Wang, Xueying,et al."Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs".ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 21.1(2024):26.

Ingestion method: OAI harvesting

Source: Institute of Computing Technology

