Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units

moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units

Document type: Journal article


Authors	Hu, Xiaobo Sharon 1; Han, Yinhe 2; Chen, Danny Ziyi 1; Chen, Xiaoming 2
Journal	IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
Publication date	2019-03-01
Volume	30 Issue: 3 Pages: 646-661
Keywords	Deep neural networks; graphics processing units; memory usage
ISSN	1045-9219
DOI	10.1109/TPDS.2018.2866582
Abstract	Graphics processing units (GPUs) have been widely adopted to accelerate the training of deep neural networks (DNNs). Although the computational performance of GPUs has been improving steadily, the memory size of modern GPUs is still quite limited, which restricts the sizes of DNNs that can be trained on GPUs and hence raises serious challenges. This paper introduces a framework, referred to as moDNN (memory optimal DNN training on GPUs), to optimize memory usage in DNN training. moDNN supports automatic tuning of DNN training code to match any given memory budget (not smaller than the theoretical lower bound). By taking full advantage of overlapping computations and data transfers, we develop new heuristics to judiciously schedule data offloading and prefetching transfers, together with convolution algorithm selection, to optimize memory usage. We further devise a new sub-batch size selection method which also greatly reduces memory usage. moDNN can reduce memory usage by up to 59x, compared with an ideal case which assumes that the GPU memory is sufficient to hold all data. When executing moDNN on a GPU with 12 GB memory, the training time is increased by only 3 percent, which is much smaller than the increase incurred by the best known approach, vDNN. Furthermore, we propose an optimization strategy for moDNN on multiple GPUs, again by utilizing the idea of overlapping data transfers and GPU computations. The results show that a 3.7x speedup is attained on four GPUs.
Funding	National Science Foundation (NSF)[CCF-1217906] ; National Science Foundation (NSF)[CNS-1629914] ; National Science Foundation (NSF)[CCF-1617735] ; National Science Foundation (NSF)[CCF-1640081] ; Nanoelectronics Research Corporation (NERC) of the Semiconductor Research Corporation (SRC), through Extremely Energy Efficient Collective Electronics (EXCEL), an SRC-NRI Nanoelectronics Research Initiative[2698.004] ; Nanoelectronics Research Corporation (NERC) of the Semiconductor Research Corporation (SRC), through Extremely Energy Efficient Collective Electronics (EXCEL), an SRC-NRI Nanoelectronics Research Initiative[2698.005]
WOS research area	Computer Science ; Engineering
Language	English
WOS accession number	WOS:000458820700012
Publisher	IEEE COMPUTER SOC
Source URL	[http://119.78.100.204/handle/2XEOYT63/3412]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding author	Chen, Xiaoming
Author affiliations	1. Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA; 2. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
Recommended citation GB/T 7714	Hu, Xiaobo Sharon,Han, Yinhe,Chen, Danny Ziyi,et al. moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,2019,30(3):646-661.
APA	Hu, Xiaobo Sharon,Han, Yinhe,Chen, Danny Ziyi,&Chen, Xiaoming.(2019).moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units.IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,30(3),646-661.
MLA	Hu, Xiaobo Sharon,et al."moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units".IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 30.3(2019):646-661.

Ingest method: OAI harvesting

Source: Institute of Computing Technology
