An Application-oblivious Memory Scheduling System for DNN Accelerators
Document type: Journal article
Authors | Li, Jiansong (5); Wang, Xueying (3,4); Chen, Xiaobing (3,4); Li, Guangli (3,4); Dong, Xiao (2); Zhao, Peng (1); Yu, Xianzhi (1); Yang, Yongxin (3,4); Cao, Wei (3,4); Liu, Lei (3,4) |
Journal | ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION |
Publication date | 2022-12-01 |
Volume | 19; Issue | 4; Pages | 26 |
Keywords | Deep learning; memory scheduling; runtime system; DNN accelerators |
ISSN | 1544-3566 |
DOI | 10.1145/3535355 |
Abstract | Deep Neural Networks (DNNs) tend to go deeper and wider, which poses a significant challenge to DNN training due to the limited memory capacity of DNN accelerators. Existing solutions for memory-efficient DNN training are tightly coupled with the application features of DNN workloads; e.g., the layer structures or computational graphs of DNNs are necessary for these solutions. This results in weak versatility for DNNs with sophisticated layer structures or complicated computational graphs: such schemes usually need to be re-implemented or re-adapted for the new layer structures or unusual operators in the computational graphs introduced by these DNNs. In this article, we review the memory pressure issues of DNN training from the perspective of runtime systems and model the memory access behaviors of DNN workloads. We identify the iterative, regularity, and extremalization properties of memory access patterns for DNN workloads. Based on these observations, we propose AppObMem, an application-oblivious memory scheduling system. AppObMem automatically traces the memory behaviors of DNN workloads and schedules memory swapping to reduce the memory pressure on device accelerators, without any perception of high-level information such as layer structures or computational graphs. Evaluations on a variety of DNN models show that AppObMem obtains 40-60% memory savings with acceptable performance loss. AppObMem is also competitive with other open-sourced state-of-the-art (SOTA) schemes. |
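The abstract describes tracing memory behaviors and scheduling swaps without graph-level knowledge, exploiting the iterative regularity of training. The following sketch is purely illustrative (not AppObMem's actual implementation; the function name, trace format, and threshold are assumptions): it picks swap-out candidates from a one-iteration access trace by reuse distance, on the premise that a trace from one iteration predicts the next.

```python
def swap_candidates(trace, threshold):
    """Hypothetical reuse-distance heuristic.

    trace: list of (step, buffer_id) records from one training iteration.
    Returns the set of buffer ids whose gap between consecutive accesses
    exceeds `threshold` steps - long-idle buffers are good swap-out targets.
    """
    last_seen = {}
    candidates = set()
    for step, buf in trace:
        if buf in last_seen and step - last_seen[buf] > threshold:
            # Long idle window between uses: swap out after the first access,
            # swap back in before the later one.
            candidates.add(buf)
        last_seen[buf] = step
    return candidates

# Toy iteration: activations A and B are produced early in the forward pass
# and reused late in backpropagation; C is reused immediately.
trace = [(0, "A"), (1, "B"), (2, "C"), (3, "C"), (10, "B"), (12, "A")]
print(sorted(swap_candidates(trace, threshold=5)))  # ['A', 'B']
```

Because the heuristic only consumes an address-level access log, it needs no layer structure or computational graph, which is the sense in which the abstract's approach is "application-oblivious".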
Funding | National Natural Science Foundation of China [61872043] |
WOS research area | Computer Science |
Language | English |
WOS accession number | WOS:000893255000001 |
Publisher | ASSOC COMPUTING MACHINERY |
Source URL | http://119.78.100.204/handle/2XEOYT63/20222 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers |
Corresponding author | Li, Guangli |
Affiliations | 1. Huawei 2012 Lab, Beijing, Peoples R China; 2. NVIDIA Corp, Shanghai, Peoples R China; 3. Univ Chinese Acad Sci, Sch Comp Sci & Technol, 19 A Yuquan Rd, Beijing 100049, Peoples R China; 4. Chinese Acad Sci, State Key Lab Comp Architecture, Inst Comp Technol, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China; 5. Huawei Galois Lab, Beijing, Peoples R China |
Recommended citation (GB/T 7714) | Li, Jiansong, Wang, Xueying, Chen, Xiaobing, et al. An Application-oblivious Memory Scheduling System for DNN Accelerators[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2022, 19(4): 26. |
APA | Li, Jiansong., Wang, Xueying., Chen, Xiaobing., Li, Guangli., Dong, Xiao., ... & Feng, Xiaobing. (2022). An Application-oblivious Memory Scheduling System for DNN Accelerators. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 19(4), 26. |
MLA | Li, Jiansong, et al. "An Application-oblivious Memory Scheduling System for DNN Accelerators." ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 19.4 (2022): 26. |
Ingest method: OAI harvesting
Source: Institute of Computing Technology
Unless otherwise stated, all content in this system is protected by copyright and all rights are reserved.