Chinese Academy of Sciences Institutional Repositories Grid
FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs

Document Type: Journal Article

Authors: Li, Fanrong [2,3]; Li, Gang [2,4]; Mo, Zitao [2,4]; He, Xiangyu [2,4]; Cheng, Jian [1,2,5]
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
Publication Date: 2020-11-01
Volume: 39  Issue: 11  Pages: 3589-3600
Keywords: Accelerator architecture; convolutional neural networks (CNNs); sparsity
ISSN: 0278-0070
DOI: 10.1109/TCAD.2020.3012212
Corresponding Author: Cheng, Jian (jcheng@nlpr.ia.ac.cn)
Abstract: Sparsity, as an intrinsic property of convolutional neural networks (CNNs), has been widely employed for hardware acceleration, and many customized accelerators tailored for sparse weights or activations have been proposed in recent years. However, the irregular sparse patterns introduced jointly by weights and activations are much more challenging for efficient computation. For example, due to the issues of access contention, workload imbalance, and tile fragmentation, the state-of-the-art sparse accelerator SCNN fails to fully leverage the benefits of sparsity, leading to suboptimal results for both speedup and energy efficiency. In this article, we propose an efficient sparse CNN accelerator for both weights and activations, namely the fine-grained systolic accelerator (FSA), which jointly optimizes the hardware dataflow and the software partitioning and scheduling strategies. Specifically, to deal with the access contention problem, we present a fine-grained systolic dataflow, in which the activations move rhythmically along the horizontal processing element array while the weights are fed into the array in a fine-grained order. We then propose a hybrid network partitioning strategy that applies different partitioning strategies to different layers to balance the workload and alleviate the fragmentation problem caused by both sparse weights and activations. Finally, we present a scheduling search strategy to find optimized schedules for neural networks, which can further improve energy efficiency. Extensive evaluations show that the proposed FSA consistently outperforms SCNN over AlexNet, VGGNet, GoogLeNet, and ResNet with an average speedup of 1.74x and up to 13.86x higher energy efficiency.
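The compute-skipping premise behind sparse accelerators such as FSA can be illustrated with a toy example. The sketch below (purely illustrative; `conv_mac_counts` and the sparsity levels are assumptions, not the paper's dataflow) counts how many multiply-accumulate (MAC) operations a dense engine would perform for a 2-D convolution versus the "effective" MACs where both the weight and the activation it multiplies are nonzero, which is the work a joint weight-and-activation sparse accelerator ideally spends:

```python
import numpy as np

def conv_mac_counts(activations, weights):
    """Count dense vs. effective MACs for a 'valid' 2-D convolution.

    A sparsity-aware engine only spends a MAC when BOTH the weight
    and the activation it multiplies are nonzero; a dense engine
    performs every MAC regardless.
    """
    ah, aw = activations.shape
    kh, kw = weights.shape
    oh, ow = ah - kh + 1, aw - kw + 1
    dense = oh * ow * kh * kw  # MACs a dense engine performs
    effective = 0
    for i in range(oh):
        for j in range(ow):
            patch = activations[i:i + kh, j:j + kw]
            # Count positions where both operands are nonzero.
            effective += np.count_nonzero((patch != 0) & (weights != 0))
    return dense, effective

rng = np.random.default_rng(0)
# ~70% zero activations (e.g., post-ReLU) and ~60% zero (pruned) weights
a = rng.random((8, 8)) * (rng.random((8, 8)) > 0.7)
w = rng.random((3, 3)) * (rng.random((3, 3)) > 0.6)
dense, effective = conv_mac_counts(a, w)
print(dense, effective)  # effective MACs are a small fraction of dense MACs
```

Because the two sparsity patterns multiply (roughly 30% nonzero activations times 40% nonzero weights leaves only ~12% of the dense MACs), the potential savings are large, but the surviving nonzeros land at irregular positions, which is exactly what causes the access contention, workload imbalance, and fragmentation problems the abstract describes.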
WOS Keywords: TIME
Funding Projects: National Natural Science Foundation of China [61972396]; National Natural Science Foundation of China [61876182]; National Natural Science Foundation of China [61906193]; Strategic Priority Research Program of Chinese Academy of Science [XDB32050200]; Advance Research Program [31511130301]
WOS Research Areas: Computer Science; Engineering
Language: English
WOS Record Number: WOS:000587712700037
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding Organizations: National Natural Science Foundation of China; Strategic Priority Research Program of Chinese Academy of Science; Advance Research Program
Source URL: http://ir.ia.ac.cn/handle/173211/41756
Subject Area: Brain-Inspired Chips and Systems Research
Author Affiliations:
1. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
5. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
Recommended Citation Formats

GB/T 7714: Li, Fanrong, Li, Gang, Mo, Zitao, et al. FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39(11): 3589-3600.
APA: Li, Fanrong, Li, Gang, Mo, Zitao, He, Xiangyu, & Cheng, Jian. (2020). FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), 3589-3600.
MLA: Li, Fanrong, et al. "FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 39.11 (2020): 3589-3600.

Deposit Method: OAI Harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.