Chinese Academy of Sciences Institutional Repositories Grid
FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs

Document Type: Journal Article

Authors: Li, Fanrong [2,3]; Li, Gang [2,4]; Mo, Zitao [2,4]; He, Xiangyu [2,4]; Cheng, Jian [1,2,5]
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
Publication Date: 2020-11-01
Volume: 39  Issue: 11  Pages: 3589-3600
Keywords: Accelerator architecture; convolutional neural networks (CNNs); sparsity
ISSN: 0278-0070
DOI: 10.1109/TCAD.2020.3012212
Corresponding Author: Cheng, Jian (jcheng@nlpr.ia.ac.cn)
Abstract: Sparsity, as an intrinsic property of convolutional neural networks (CNNs), has been widely employed for hardware acceleration, and many customized accelerators tailored for sparse weights or activations have been proposed in recent years. However, the irregular sparse patterns introduced jointly by weights and activations are much more challenging for efficient computation. For example, due to the issues of access contention, workload imbalance, and tile fragmentation, the state-of-the-art sparse accelerator SCNN fails to fully leverage the benefits of sparsity, leading to suboptimal results for both speedup and energy efficiency. In this article, we propose an efficient sparse CNN accelerator for both weights and activations, namely the fine-grained systolic accelerator (FSA), which jointly optimizes the hardware dataflow and the software partitioning and scheduling strategies. Specifically, to deal with the access contention problem, we present a fine-grained systolic dataflow, in which the activations move rhythmically along the horizontal processing element array while the weights are fed into the array in a fine-grained order. We then propose a hybrid network partitioning strategy that applies different partitioning strategies to different layers to balance the workload and alleviate the fragmentation problem caused by both sparse weights and activations. Finally, we present a scheduling search strategy to find optimized schedules for neural networks, which can further improve energy efficiency. Extensive evaluations show that the proposed FSA consistently outperforms SCNN over AlexNet, VGGNet, GoogLeNet, and ResNet with an average speedup of 1.74x and up to 13.86x higher energy efficiency.
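The compute-skipping premise behind sparse accelerators such as FSA can be illustrated with a toy example. The sketch below (purely illustrative; `conv_mac_counts` and the sparsity levels are assumptions, not the paper's dataflow) counts how many multiply-accumulate (MAC) operations a dense engine would perform for a 2-D convolution versus the "effective" MACs where both the weight and the activation it multiplies are nonzero, which is the work a joint weight-and-activation sparse accelerator ideally spends:

```python
import numpy as np

def conv_mac_counts(activations, weights):
    """Count dense vs. effective MACs for a 'valid' 2-D convolution.

    A sparsity-aware engine only spends a MAC when BOTH the weight
    and the activation it multiplies are nonzero; a dense engine
    performs every MAC regardless.
    """
    ah, aw = activations.shape
    kh, kw = weights.shape
    oh, ow = ah - kh + 1, aw - kw + 1
    dense = oh * ow * kh * kw  # MACs a dense engine performs
    effective = 0
    for i in range(oh):
        for j in range(ow):
            patch = activations[i:i + kh, j:j + kw]
            # Count positions where both operands are nonzero.
            effective += np.count_nonzero((patch != 0) & (weights != 0))
    return dense, effective

rng = np.random.default_rng(0)
# ~70% zero activations (e.g., post-ReLU) and ~60% zero (pruned) weights
a = rng.random((8, 8)) * (rng.random((8, 8)) > 0.7)
w = rng.random((3, 3)) * (rng.random((3, 3)) > 0.6)
dense, effective = conv_mac_counts(a, w)
print(dense, effective)  # effective MACs are a small fraction of dense MACs
```

Because the two sparsity patterns multiply (roughly 30% nonzero activations times 40% nonzero weights leaves only ~12% of the dense MACs), the potential savings are large, but the surviving nonzeros land at irregular positions, which is exactly what causes the access contention, workload imbalance, and fragmentation problems the abstract describes.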
WOS Keywords: TIME
Funding Projects: National Natural Science Foundation of China [61972396]; National Natural Science Foundation of China [61876182]; National Natural Science Foundation of China [61906193]; Strategic Priority Research Program of Chinese Academy of Science [XDB32050200]; Advance Research Program [31511130301]
WOS Research Areas: Computer Science; Engineering
Language: English
WOS Record Number: WOS:000587712700037
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding Organizations: National Natural Science Foundation of China; Strategic Priority Research Program of Chinese Academy of Science; Advance Research Program
Source URL: http://ir.ia.ac.cn/handle/173211/41756
Subject Area: Brain-Inspired Chips and Systems Research
Author Affiliations:
1. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
5. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
Recommended Citation Formats

GB/T 7714: Li, Fanrong, Li, Gang, Mo, Zitao, et al. FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39(11): 3589-3600.
APA: Li, Fanrong, Li, Gang, Mo, Zitao, He, Xiangyu, & Cheng, Jian. (2020). FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), 3589-3600.
MLA: Li, Fanrong, et al. "FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 39.11 (2020): 3589-3600.

Deposit Method: OAI Harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.