Accelerate Convolutional Neural Network with a customized VLIW DSP
Document type: Conference paper
| Authors | Guo Peng 1,2; Ma Hong 1; Guo Ruoshan 1; Liu Zhuang 1; Li Pin 1; Wang Donglin 1 |
| Publication date | 2018-08 |
| Conference date | 2018-10 |
| Conference location | Beijing |
| Abstract | Convolutional neural networks (CNNs) have achieved outstanding performance in many domains. However, state-of-the-art CNN models also introduce massive computation and a huge memory footprint. To facilitate the deployment of CNNs on embedded platforms, many existing studies focus on designing dedicated hardware accelerators. However, there still exist many legacy DSP-based platforms that can also be exploited to accelerate CNN inference. In this work, we study the computation of CNNs on MaPU, a customized VLIW DSP. MaPU is equipped with a multi-granularity parallel memory system and a flexible programming model, which makes it well suited for compute-intensive tasks. Through an in-depth analysis of the CNN's parallelism and the hardware architecture, we propose a kernel-expanded scheduling scheme that handles different kernel sizes uniformly. In our experiment on a face recognition network, MaPU achieves high performance and power efficiency. |
| Language | English |
| Source URL | http://ir.ia.ac.cn/handle/173211/23879 |
| Collection | Institute of Automation, National Engineering Research Center for ASIC Design |
| Corresponding author | Guo Peng |
| Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences |
| Recommended citation (GB/T 7714) | Guo Peng, Ma Hong, Guo Ruoshan, et al. Accelerate Convolutional Neural Network with a customized VLIW DSP[C]. Beijing, 2018-10. |
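
The abstract above mentions a kernel-expanded scheduling scheme that handles different kernel sizes uniformly, but this record does not describe how the scheme works. As a purely illustrative sketch (not the paper's actual algorithm), the C program below shows one common way to make kernel size uniform: a K×K convolution is decomposed into K·K identical shifted multiply-accumulate passes, one per kernel tap, so the same inner loop serves any kernel size. All names and the decomposition itself are assumptions for illustration only.

```c
#include <stdio.h>
#include <string.h>

#define H 8          /* input height */
#define W 8          /* input width  */
#define K 3          /* kernel size; any K works with the same loop */

/* One uniform accumulation pass: for kernel tap (kr, kc), multiply the
 * shifted input window by a single weight and accumulate into the output.
 * Running this pass K*K times computes a full KxK convolution (valid mode),
 * so kernels of different sizes reuse the identical inner loop. */
static void mac_pass(const float in[H][W], float out[H - K + 1][W - K + 1],
                     float weight, int kr, int kc)
{
    for (int r = 0; r < H - K + 1; r++)
        for (int c = 0; c < W - K + 1; c++)
            out[r][c] += weight * in[r + kr][c + kc];
}

int main(void)
{
    float in[H][W], kernel[K][K], out[H - K + 1][W - K + 1];

    /* Toy data: input of ones, kernel of ones -> every output equals K*K. */
    for (int r = 0; r < H; r++)
        for (int c = 0; c < W; c++)
            in[r][c] = 1.0f;
    for (int kr = 0; kr < K; kr++)
        for (int kc = 0; kc < K; kc++)
            kernel[kr][kc] = 1.0f;
    memset(out, 0, sizeof out);

    /* The KxK convolution expressed as K*K identical shifted MAC passes. */
    for (int kr = 0; kr < K; kr++)
        for (int kc = 0; kc < K; kc++)
            mac_pass(in, out, kernel[kr][kc], kr, kc);

    printf("out[0][0] = %.1f (expected %d)\n", out[0][0], K * K);
    return 0;
}
```

Each pass is a regular streaming loop over the whole output plane, which is the kind of workload a wide VLIW or vector datapath schedules efficiently; whether the paper's kernel-expanded scheme follows this exact decomposition is not stated in this record.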
Deposit method: OAI harvesting
Source: Institute of Automation

