Chinese Academy of Sciences Institutional Repositories Grid
General Purpose Deep Learning Accelerator Based on Bit Interleaving

Document Type: Journal Article

Authors: Chang, Liang (1); Lu, Hang (2,3,4); Li, Chenglong (1); Zhao, Xin (1); Hu, Zhicheng (1); Zhou, Jun (1); Li, Xiaowei (2,3,4)
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
Publication Date: 2024-05-01
Volume: 43, Issue: 5, Pages: 1470-1483
Keywords: Synchronization; Parallel processing; Computational modeling; Training; Pragmatics; Power demand; Hardware acceleration; Accelerator; bit-level sparsity; deep neural network (DNN)
ISSN: 0278-0070
DOI: 10.1109/TCAD.2023.3342728
Abstract: Along with the rapid evolution of deep neural networks, their ever-increasing complexity imposes formidable computation intensity on the hardware accelerator. In this article, we propose a novel computing philosophy called "bit interleaving" and the associated accelerator pair, "Bitlet" and "Bitlet-X", to maximally exploit bit-level sparsity. Unlike existing bit-serial/bit-parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to accelerate inference. Bitlet is versatile, supporting diverse precisions on a single platform, including 32-bit floating point and fixed point from 1 b to 24 b. This versatility makes Bitlet suitable for both efficient inference and training. Moreover, by updating the key compute engine in the accelerator, Bitlet-X further improves peak power consumption and efficiency for the inference-only scenario, with competitive accuracy. Empirical studies on 12 domain-specific deep learning applications highlight the following results: 1) up to 81x/21x energy efficiency improvement for training/inference over recent high-performance GPUs; 2) up to 15x/8x higher speedup/efficiency over state-of-the-art fixed-point accelerators; 3) 1.5 mm² area and scalable power consumption from 570 mW (fp32) to 432 mW (16 b) and 365 mW (8 b) at 28-nm TSMC; 4) a 1.3x improvement in peak power efficiency for Bitlet-X over Bitlet; and 5) high configurability, justified by ablation and sensitivity studies.
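The bit-level sparsity that the abstract centers on is easy to illustrate: in fixed-point DNN weights most bits are zero, so a multiplication can be decomposed into shift-adds over only the nonzero ("essential") bits of the weight, and those essential bits across many parameters can be processed in parallel. The Python sketch below is purely illustrative and is not the paper's Bitlet design; the function names, bit width, and weight distribution are assumptions made for the example.

```python
# Illustrative sketch only -- NOT the paper's Bitlet implementation. It shows
# the bit-level-sparsity idea from the abstract: a fixed-point multiply
# reduces to shift-adds over the nonzero ("essential") bits of the weight.
import numpy as np

WIDTH = 8  # assumed fixed-point width for this example

def essential_bits(w: int, width: int = WIDTH) -> list:
    """Return the positions of the nonzero bits of a weight magnitude."""
    return [i for i in range(width) if (w >> i) & 1]

def shift_add_mul(w: int, x: int) -> int:
    """Multiply w * x by accumulating x shifted to each essential bit of w."""
    return sum(x << i for i in essential_bits(w))

assert shift_add_mul(0b0101, 7) == 0b0101 * 7  # sanity check: 35 == 35

# DNN weights are typically small in magnitude, so after quantization most
# of their bits are zero; we model that with a clipped Gaussian (an assumption).
rng = np.random.default_rng(0)
weights = np.clip(np.abs(rng.normal(0, 16, size=10_000)).astype(int),
                  0, 2**WIDTH - 1)

density = np.mean([len(essential_bits(int(w))) for w in weights]) / WIDTH
print(f"essential-bit density: {density:.2f}")  # well below 1.0 -> skippable work
```

The lower the essential-bit density, the more shift-add work a bit-sparsity-aware accelerator can skip; exploiting that slack across many parameters at once is, per the abstract, what the paper's "sparsity parallelism" refers to.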
Funding: National Natural Science Foundation of China
WOS Research Areas: Computer Science; Engineering
Language: English
WOS Accession Number: WOS:001225897600012
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: http://119.78.100.204/handle/2XEOYT63/40062
Collection: Institute of Computing Technology, Chinese Academy of Sciences / Journal Papers (English)
Corresponding Author: Lu, Hang
Affiliations:
1. Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China
3. Zhongguancun Lab, Beijing 100081, Peoples R China
4. Shanghai Innovat Ctr Processor Technol, Shanghai 200120, Peoples R China
Recommended Citation:
GB/T 7714
Chang, Liang, Lu, Hang, Li, Chenglong, et al. General Purpose Deep Learning Accelerator Based on Bit Interleaving[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43(5): 1470-1483.
APA Chang, Liang., Lu, Hang., Li, Chenglong., Zhao, Xin., Hu, Zhicheng., ... & Li, Xiaowei. (2024). General Purpose Deep Learning Accelerator Based on Bit Interleaving. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(5), 1470-1483.
MLA Chang, Liang, et al. "General Purpose Deep Learning Accelerator Based on Bit Interleaving". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 43.5 (2024): 1470-1483.

Deposit Method: OAI Harvesting

Source: Institute of Computing Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.