Chinese Academy of Sciences Institutional Repositories Grid
Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning

Document Type: Journal Article

Authors: Li, Guangli (1,2); Ma, Xiu (3); Wang, Xueying (1); Yue, Hengshan (3); Li, Jiansong (1,2); Liu, Lei (1); Feng, Xiaobing (1,2); Xue, Jingling (4)
Journal: JOURNAL OF SYSTEMS ARCHITECTURE
Publication Date: 2022-03-01
Volume: 124; Pages: 11
Keywords: Edge intelligence; Deep learning; Neural network compression
ISSN: 1383-7621
DOI: 10.1016/j.sysarc.2022.102431
Abstract: While deep learning has shown superior performance in various intelligent tasks, it remains challenging to deploy sophisticated models on resource-limited edge devices. Filter pruning performs a system-independent optimization that shrinks a neural network model into a thinner one, providing an attractive solution for efficient on-device inference. Prevailing approaches usually apply a fixed pruning rate across the whole neural network model to reduce the optimization space of filter pruning. However, the filters of different layers may have different sensitivities to model inference, so a flexible per-layer pruning rate can potentially further increase the accuracy of compressed models. In this paper, we propose FlexPruner, a novel approach for compressing and accelerating neural network models via flexible-rate filter pruning. Our approach follows a greedy strategy to select the filters to be pruned and performs an iterative loss-aware pruning process, thereby achieving a remarkable accuracy improvement over existing methods when numerous filters are pruned. Evaluation with state-of-the-art residual neural networks on six representative intelligent edge accelerators demonstrates the effectiveness of FlexPruner, which decreases the accuracy degradation of pruned models by leveraging flexible pruning rates and achieves practical speedups for on-device inference.
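The greedy, flexible-rate selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `filter_scores` stands in for a loss-aware importance estimate (lower means cheaper to remove), and the paper's iterative re-estimation of loss between pruning steps is omitted.

```python
def flex_prune(filter_scores, total_to_prune, min_keep=1):
    """Greedily pick filters to prune across all layers at once.

    filter_scores: list of per-layer score lists; a lower score means
    removing that filter is estimated to increase the loss less.
    Because candidates are ranked globally rather than per layer, each
    layer ends up with its own (flexible) pruning rate.
    Returns a list of sorted pruned-filter indices per layer.
    """
    pruned = [set() for _ in filter_scores]
    # Flatten all (score, layer, filter) candidates and rank globally.
    candidates = sorted(
        (score, li, fi)
        for li, scores in enumerate(filter_scores)
        for fi, score in enumerate(scores)
    )
    for score, li, fi in candidates:
        if total_to_prune == 0:
            break
        # Never prune a layer below `min_keep` remaining filters.
        if len(filter_scores[li]) - len(pruned[li]) > min_keep:
            pruned[li].add(fi)
            total_to_prune -= 1
    return [sorted(p) for p in pruned]

# Hypothetical scores for a 2-layer model: layer 0 has 3 filters,
# layer 1 has 2. Pruning 2 filters removes the globally cheapest ones,
# giving layer rates of 1/3 and 1/2 respectively.
scores = [[0.9, 0.1, 0.5], [0.05, 0.8]]
print(flex_prune(scores, 2))  # [[1], [0]]
```

In the full loss-aware variant suggested by the abstract, the scores would be re-estimated on a calibration set after each pruning step, so earlier removals influence later choices.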
Funding: National Key R&D Program of China [2017YFB1003103]; National Natural Science Foundation of China [61872043]; National Natural Science Foundation of China [61802368]; Science Fund for Creative Research Groups of the National Natural Science Foundation of China [61521092]
WOS Research Area: Computer Science
Language: English
WOS Record Number: WOS:000782573200016
Publisher: ELSEVIER
Source URL: [http://119.78.100.204/handle/2XEOYT63/18874]
Collection: Institute of Computing Technology, Chinese Academy of Sciences - Journal Articles (English)
Corresponding Author: Li, Guangli
Affiliations:
1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Beijing, Peoples R China
3. Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
4. Univ New S Wales, Sch Engn & Comp Sci, Sydney, NSW, Australia
Recommended Citation:
GB/T 7714: Li, Guangli, Ma, Xiu, Wang, Xueying, et al. Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning[J]. JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 124: 11.
APA: Li, Guangli, Ma, Xiu, Wang, Xueying, Yue, Hengshan, Li, Jiansong, ... & Xue, Jingling. (2022). Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning. JOURNAL OF SYSTEMS ARCHITECTURE, 124, 11.
MLA: Li, Guangli, et al. "Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning". JOURNAL OF SYSTEMS ARCHITECTURE 124 (2022): 11.

Deposit Method: OAI Harvesting

Source: Institute of Computing Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.