MJOA-MU: End-to-edge collaborative computation for DNN inference based on model uploading
Document Type: Journal Article
Authors | Yang, Huan (2,3); Sun, Sheng (2); Liu, Min (1,2,3); Zhang, Qiuping (2,3); Wang, Yuwei (2) |
Journal | COMPUTER NETWORKS |
Publication Date | 2023-07-01 |
Volume | 231; Pages: 17 |
Keywords | DNN inference; Model uploading; DNN partitioning; Resource allocation |
ISSN | 1389-1286 |
DOI | 10.1016/j.comnet.2023.109801 |
Abstract | As an emerging computing paradigm, edge computing can assist user equipments (UEs) in executing computation-intensive deep neural network (DNN) inference tasks, thereby satisfying stringent QoS requirements and relieving the burden on UEs. Due to the customizability of DNN models and the limited capacity of the edge server, it is more realistic to upload DNN models on demand during end-to-edge co-inference, instead of deploying all DNN models at the edge server in advance. Existing works adopt a serial model uploading manner that uploads subsequent DNN layers only after antecedent DNN layers finish execution, inevitably prolonging the DNN execution latency. To this end, we innovatively design a parallel-efficient model uploading mechanism that allows subsequent DNN layers to be uploaded simultaneously while antecedent DNN layers are executing, so as to efficiently mitigate the performance drop caused by model uploading. On this basis, we propose a Multi-UE Joint Optimization Algorithm based on Model Uploading (MJOA-MU) to optimize DNN partitioning and resource allocation for heterogeneous UEs. Specifically, MJOA-MU includes a Pruned Binary Tree based DNN Partitioning (PBT-DP) sub-algorithm to efficiently make near-optimal partitioning decisions for chain and non-chain models based on the long-term influence between DNN layers, and an Asynchronous Resource Allocation (ARA) sub-algorithm to allocate computation and communication resources for UEs by quantifying the inner- and inter-association, so as to match individual demands and resource budgets. Extensive simulation results demonstrate that MJOA-MU outperforms the state-of-the-art in terms of DNN execution latency, achieving up to a 64.5% reduction. |
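The latency argument at the heart of the abstract — serial uploading pays upload and execution costs back-to-back per layer, while pipelined uploading overlaps the upload of layer i+1 with the execution of layer i — can be illustrated with a minimal sketch. This is an illustrative model only, not the paper's MJOA-MU algorithm; the per-layer upload times `u` and edge execution times `c` are hypothetical inputs:

```python
def serial_latency(u, c):
    """Serial manner: layer i+1 is uploaded only after layer i finishes executing,
    so each layer contributes its full upload time plus execution time."""
    return sum(ui + ci for ui, ci in zip(u, c))

def pipelined_latency(u, c):
    """Pipelined manner: the link uploads layers back-to-back while earlier
    layers execute; a layer starts once both its upload and its predecessor's
    execution have finished."""
    ready = 0.0   # time at which the current layer's upload completes
    finish = 0.0  # time at which the previous layer's execution completes
    for ui, ci in zip(u, c):
        ready += ui
        finish = max(finish, ready) + ci
    return finish

# Hypothetical three-layer example (time units are arbitrary):
u = [4, 2, 3]  # per-layer upload times
c = [5, 5, 5]  # per-layer edge execution times
print(serial_latency(u, c))     # 24: uploads and executions never overlap
print(pipelined_latency(u, c))  # 19: later uploads hide behind earlier executions
```

The gap between the two totals is exactly the upload time hidden behind computation, which is the performance drop the parallel-efficient mechanism is designed to mitigate.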
Funding | National Key Research and Development Program of China [2021YFB2900102]; National Natural Science Foundation of China [62072436]; National Natural Science Foundation of China [62202449] |
WOS Research Areas | Computer Science; Engineering; Telecommunications |
Language | English |
WOS Accession Number | WOS:001001503300001 |
Publisher | ELSEVIER |
Source URL | http://119.78.100.204/handle/2XEOYT63/21475 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences — Journal Articles (English) |
Corresponding Author | Liu, Min |
Affiliations | 1. Zhongguancun Lab, Beijing, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China; 3. Univ Chinese Acad Sci, Beijing, Peoples R China |
Recommended Citation (GB/T 7714) | Yang, Huan, Sun, Sheng, Liu, Min, et al. MJOA-MU: End-to-edge collaborative computation for DNN inference based on model uploading[J]. COMPUTER NETWORKS, 2023, 231: 17. |
APA | Yang, Huan, Sun, Sheng, Liu, Min, Zhang, Qiuping, & Wang, Yuwei. (2023). MJOA-MU: End-to-edge collaborative computation for DNN inference based on model uploading. COMPUTER NETWORKS, 231, 17. |
MLA | Yang, Huan, et al. "MJOA-MU: End-to-edge collaborative computation for DNN inference based on model uploading". COMPUTER NETWORKS 231 (2023): 17. |
Ingestion Method: OAI harvesting
Source: Institute of Computing Technology