Optimus: An Operator Fusion Framework for Deep Neural Networks
Document Type | Journal Article
Authors | Cai, Xuyi (3,4); Wang, Ying (1,2); Zhang, Lei (4)
Journal | ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS
Publication Date | 2023
Volume, Issue, Pages | 22(1):26
ISSN | 1539-9087
Keywords | Neural network; embedded processor; memory; layer fusion
DOI | 10.1145/3520142
Abstract | The reduction of neural parameters and operations in current deep neural network (DNN) architectures for applications on embedded and IoT platforms has received increasing attention. Meanwhile, the intermediate feature maps of such lightweight neural networks grow and often exceed the on-chip memory, becoming the new bottleneck and introducing considerable power-consuming off-chip memory accesses. To reduce these feature-induced memory accesses, operator fusion has been proposed to parallelize the execution of multiple convolutional layers and has shown a significant reduction in off-chip memory accesses. However, how to fuse the neural operators remains a challenging issue that depends heavily on both the neural network (NN) topology and the specific DNN accelerator configuration. In this work, we observe that prior operator fusion approaches fail to guarantee memory-level optimality because they search a constrained operator-fusion design space. Considering the complexity of NN topologies and the constrained resources of DNN accelerators, we develop a novel operator fusion framework, Optimus. Optimus includes an accurate memory cost model, dedicated to the scheduler, for evaluating candidate operator-fusion schemes, and a directed acyclic graph-based operator fusion algorithm for both off-line and on-line workload deployment scenarios; together, these generate high-efficiency operator-fusion solutions for arbitrary network models running on DNN accelerators. The experimental results show that Optimus reduces off-chip memory accesses by 17-75% and achieves 1.86x-3.66x energy efficiency on state-of-the-art DNN workloads compared to the baselines, bringing a significant power-efficiency boost to DNN accelerators of different architectures and dataflows.
WOS Research Area | Computer Science
Language | English
Publisher | ASSOC COMPUTING MACHINERY
WOS Accession Number | WOS:000908419900001
Source URL | [http://119.78.100.204/handle/2XEOYT63/20029]
Collection | Journal papers of the Institute of Computing Technology, Chinese Academy of Sciences
Corresponding Author | Wang, Ying
Affiliations | 1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China; 2. Chinese Acad Sci, Zhejiang Lab, Beijing, Peoples R China; 3. Univ Chinese Acad Sci, Beijing, Peoples R China; 4. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
Recommended Citation (GB/T 7714) | Cai, Xuyi, Wang, Ying, Zhang, Lei. Optimus: An Operator Fusion Framework for Deep Neural Networks[J]. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22(1): 26.
APA | Cai, Xuyi, Wang, Ying, & Zhang, Lei. (2023). Optimus: An Operator Fusion Framework for Deep Neural Networks. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 22(1), 26.
MLA | Cai, Xuyi, et al. "Optimus: An Operator Fusion Framework for Deep Neural Networks". ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS 22.1 (2023): 26.
Deposit Method: OAI harvesting
Source: Institute of Computing Technology
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.