Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): DSparse: A Distributed Training Method for Edge Clusters Based on Sparse Update

DSparse: A Distributed Training Method for Edge Clusters Based on Sparse Update

Document type: Journal article


Authors	Peng, Xiao-Hui 1,2; Sun, Yi-Xuan 2; Zhang, Zheng-Hui 1; Wang, Yi-Fan 1,2
Journal	JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
Publication date	2025-05-01
Volume	40; Issue: 3; Pages: 637-653
Keywords	distributed training; edge computing; edge machine learning; sparse update; edge cluster
ISSN	1000-9000
DOI	10.1007/s11390-025-4821-5
Abstract	Edge machine learning creates a new computational paradigm by enabling the deployment of intelligent applications at the network edge. It enhances application efficiency and responsiveness by performing inference and training tasks closer to data sources. However, it encounters several challenges in practice. The variance in hardware specifications and performance across different devices presents a major issue for the training and inference tasks. Additionally, edge devices typically possess limited network bandwidth and computing resources compared with data centers. Moreover, existing distributed training architectures often fail to consider the constraints of resources and communication efficiency in edge environments. In this paper, we propose DSparse, a method for distributed training based on sparse update in edge clusters with various memory capacities. It aims at maximizing the utilization of memory resources across all devices within a cluster. To reduce memory consumption during the training process, we adopt sparse update to prioritize the updating of selected layers on the devices in the cluster, which not only lowers memory usage but also reduces the data volume of parameters and the time required for parameter aggregation. Furthermore, DSparse utilizes a parameter aggregation mechanism based on multi-process groups, subdividing the aggregation tasks into AllReduce and Broadcast types, thereby further reducing the communication frequency for parameter aggregation. Experimental results using the MobileNetV2 model on the CIFAR-10 dataset demonstrate that DSparse reduces memory consumption by an average of 59.6% across seven devices, with a 75.4% reduction in parameter aggregation time, while maintaining model precision.
Funding	National Natural Science Foundation of China[62072434] ; National Natural Science Foundation of China[U23B2004] ; Innovation Funding of Institute of Computing Technology, Chinese Academy of Sciences[E361050] ; Innovation Funding of Institute of Computing Technology, Chinese Academy of Sciences[E361030]
WOS research area	Computer Science
Language	English
WOS accession number	WOS:001529691300017
Publisher	SPRINGER SINGAPORE PTE LTD
Source URL	[http://119.78.100.204/handle/2XEOYT63/42068]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding author	Wang, Yi-Fan
Author affiliations	1. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 101408, Peoples R China
Recommended citation (GB/T 7714)	Peng, Xiao-Hui,Sun, Yi-Xuan,Zhang, Zheng-Hui,et al. DSparse: A Distributed Training Method for Edge Clusters Based on Sparse Update[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,2025,40(3):637-653.
APA	Peng, Xiao-Hui,Sun, Yi-Xuan,Zhang, Zheng-Hui,&Wang, Yi-Fan.(2025).DSparse: A Distributed Training Method for Edge Clusters Based on Sparse Update.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,40(3),637-653.
MLA	Peng, Xiao-Hui,et al."DSparse: A Distributed Training Method for Edge Clusters Based on Sparse Update".JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 40.3(2025):637-653.
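
Illustrative note on the abstract: the record gives no implementation details, so the following pure-Python sketch shows only one plausible reading of "subdividing the aggregation tasks into AllReduce and Broadcast types". It is not the authors' algorithm. The layer names, device names, selection table, and the functions `plan_aggregation` and `aggregate` are all hypothetical, and plain floats stand in for parameter tensors. The assumption is that a layer updated by several devices is averaged among them (AllReduce), a layer updated by exactly one device is copied from it (Broadcast), and a layer updated by no device needs no communication.

```python
from typing import Dict, List, Set

# Hypothetical model layers and per-device sparse-update selection.
# The record does not say how layers are chosen; this table is an assumption.
LAYERS: List[str] = ["stem", "block1", "classifier"]
TRAINABLE: Dict[str, Set[str]] = {
    "dev_a": {"block1", "classifier"},  # e.g. a device with more memory
    "dev_b": {"classifier"},
    "dev_c": {"classifier"},
}


def plan_aggregation(trainable: Dict[str, Set[str]],
                     layers: List[str]) -> Dict[str, dict]:
    """Pick a collective per layer from the number of devices updating it."""
    plan = {}
    for layer in layers:
        owners = sorted(dev for dev, chosen in trainable.items() if layer in chosen)
        if len(owners) >= 2:
            plan[layer] = {"op": "allreduce", "group": owners}
        elif len(owners) == 1:
            plan[layer] = {"op": "broadcast", "src": owners[0]}
        else:
            plan[layer] = {"op": "skip"}  # frozen everywhere, nothing to send
    return plan


def aggregate(params: Dict[str, Dict[str, float]],
              plan: Dict[str, dict]) -> Dict[str, Dict[str, float]]:
    """Simulate the plan on params[device][layer], modifying params in place."""
    for layer, step in plan.items():
        if step["op"] == "allreduce":
            group = step["group"]
            # Average over the devices that actually trained this layer.
            value = sum(params[dev][layer] for dev in group) / len(group)
        elif step["op"] == "broadcast":
            value = params[step["src"]][layer]
        else:
            continue
        # Every device ends up with the same copy of the layer.
        for dev in params:
            params[dev][layer] = value
    return params


params = {
    "dev_a": {"stem": 1.0, "block1": 2.0, "classifier": 3.0},
    "dev_b": {"stem": 1.0, "block1": 0.5, "classifier": 6.0},
    "dev_c": {"stem": 1.0, "block1": 0.5, "classifier": 9.0},
}
plan = plan_aggregation(TRAINABLE, LAYERS)
result = aggregate(params, plan)
```

In this toy setup only two of the three layers generate any traffic, and one of those needs a single one-to-all copy rather than a full reduction, which is the kind of saving the abstract attributes to sparse update combined with grouped aggregation. In a real system the per-layer groups would presumably map to communication process groups; the record does not specify this.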

Deposit method: OAI harvesting

Source: Institute of Computing Technology
