Chinese Academy of Sciences Institutional Repositories Grid
EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression

Document Type: Journal Article

Authors: Ruan, Xiaofeng1,6; Liu, Yufan1,6; Yuan, Chunfeng6; Li, Bing5,6; Hu, Weiming1,2,6; Li, Yangxi4; Maybank, Stephen3
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Publication Date: 2020
Issue: 0  Pages: 0
Keywords: Data-driven low-rank decomposition; model compression and acceleration; structured pruning
Abstract (English)

Model compression methods have become popular in recent years; they aim to alleviate the heavy load of deep neural networks (DNNs) in real-world applications. However, most of the existing compression methods have two limitations: 1) they usually adopt a cumbersome process, including pretraining, training with a sparsity constraint, pruning/decomposition, and fine-tuning, where the last three stages are usually iterated multiple times; and 2) the models are pretrained under explicit sparsity or low-rank assumptions, which are difficult to guarantee in general. In this article, we propose an efficient decomposition and pruning (EDP) scheme via constructing a compressed-aware block that can automatically minimize the rank of the weight matrix and identify the redundant channels. Specifically, we embed the compressed-aware block by decomposing one network layer into two layers: a new weight matrix layer and a coefficient matrix layer. By imposing regularizers on the coefficient matrix, the new weight matrix learns to become a low-rank basis weight, and its corresponding channels become sparse. In this way, the proposed compressed-aware block simultaneously achieves low-rank decomposition and channel pruning with a single data-driven training stage. Moreover, the network architecture is further compressed and optimized by a novel Pruning & Merging (PM) module, which prunes redundant channels and merges redundant decomposed layers. Experimental results (against 17 competitors) on different data sets and networks demonstrate that the proposed EDP achieves a high compression ratio with acceptable accuracy degradation and outperforms state-of-the-art methods in compression rate, accuracy, inference time, and run-time memory.
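The core mechanism of the abstract can be sketched in code. The following is a minimal, hypothetical numpy illustration (not the authors' implementation): a layer's weight matrix is split into a basis weight and a coefficient matrix, a group-sparsity proximal step on the coefficient matrix zeroes out whole channels, and a PM-style step prunes zeroed channels and merges the two layers back together when the decomposition saves no parameters. All function names and the SVD-based initialization are assumptions for illustration only.

```python
import numpy as np

def decompose_layer(W, rank):
    """Split a (flattened) weight matrix W (out x in) into a coefficient
    matrix C (out x rank) and a new basis weight B (rank x in), so the
    original layer x -> W x becomes two stacked layers x -> C (B x)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    C = U[:, :rank] * s[:rank]   # coefficient matrix layer
    B = Vt[:rank, :]             # new weight matrix (low-rank basis) layer
    return C, B

def group_shrink(C, lam):
    """Proximal step for a group-lasso-style regularizer on the rows of C:
    rows (output channels) with norm below lam are driven exactly to zero,
    which is what makes whole channels prunable."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return C * scale

def prune_and_merge(C, B, tol=1e-8):
    """Sketch of a PM-style module: drop zeroed output channels of C; if
    the remaining rank gives no parameter saving over a dense layer,
    merge the two decomposed layers back into one."""
    keep = np.linalg.norm(C, axis=1) > tol
    C = C[keep]
    out_dim, rank = C.shape
    in_dim = B.shape[1]
    # two layers cost rank*(out+in) params; one merged layer costs out*in
    if rank * (out_dim + in_dim) >= out_dim * in_dim:
        return (C @ B,)          # merge the redundant decomposed layers
    return (C, B)
```

With a full-rank initialization the two layers reproduce the original mapping exactly (`C @ B == W` up to floating point); training with `group_shrink` then trades rank and channels against accuracy.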

Language: English
Source URL: [http://ir.ia.ac.cn/handle/173211/44804]
Research Unit: Institute of Automation_National Laboratory of Pattern Recognition_Video Content Security Team
Corresponding Authors: Yuan, Chunfeng; Li, Bing
Author Affiliations: 1.School of Artificial Intelligence, University of Chinese Academy of Sciences
2.CAS Center for Excellence in Brain Science and Intelligence Technology
3.Department of Computer Science and Information Systems, Birkbeck College, University of London
4.National Computer Network Emergency Response Technical Team/Coordination Center of China
5.PeopleAI Inc.
6.National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Ruan, Xiaofeng, Liu, Yufan, Yuan, Chunfeng, et al. EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020(0): 0.
APA: Ruan, Xiaofeng, Liu, Yufan, Yuan, Chunfeng, Li, Bing, Hu, Weiming, ... & Maybank, Stephen. (2020). EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS(0), 0.
MLA: Ruan, Xiaofeng, et al. "EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 0 (2020): 0.

Deposit Method: OAI Harvesting

Source: Institute of Automation


Unless otherwise specified, all content in this system is protected by copyright, and all rights are reserved.