Chinese Academy of Sciences Institutional Repositories Grid
Smart Shuffling in MapReduce: A Solution to Balance Network Traffic and Workloads

Document type: Conference paper

Authors: Wei Shi; Yang Wang; Boqiang Niu; William Lee Croft; Mengfei Peng
Publication date: 2015
Conference: IEEE/ACM UCC
Conference location: Cyprus
Abstract: In the context of Hadoop, recent studies show that the shuffle operation accounts for as much as a third of the completion time of a MapReduce job. Consequently, the shuffle phase is a crucial aspect of scheduling such jobs. During a shuffle phase, the job scheduler assigns reduce tasks to a set of reduce nodes. This may require multiple intermediate data items that share a key to be relocated to this new set of reduce nodes, which in turn could cause a large volume of simultaneous data relocations within the network. Intuitively, a reduce task experiences shorter access latency if its required items are available locally or in close proximity. Placing reduce tasks close to their data, however, may also create hotspots in the network due to imbalanced traffic, as well as an imbalanced workload across nodes, regardless of their homogeneity. In this paper, we study the data relocation incurred during the shuffle stage of the MapReduce framework. Within an arbitrary network, we aim at a) minimizing the overall network traffic, b) achieving workload balancing, and c) eliminating network hotspots, in order to improve overall performance. Our contribution is the development of a scheduler that satisfies these three goals. We then present an in-depth simulation. Our results show that, for arbitrary network topologies, our Smart Shuffling Scheduler systematically outperforms the CoGRS scheduler in terms of hotspot elimination and reduce task load balancing, while keeping the traffic caused by data relocation low. Furthermore, our algorithm is able to handle networks of any topology. In particular, for the tree topology commonly used within data centres, our proposed scheduler offers significant improvements over the CoGRS scheduler.
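The tension the abstract describes, between placing a reduce task close to its data and keeping the workload balanced, can be illustrated with a small sketch. This is not the paper's Smart Shuffling Scheduler or CoGRS; it is a hypothetical toy model in which `data` maps each key to the amount of intermediate data held on each node, `dist` gives hop counts between nodes, and `cap` is an assumed per-node load limit. All names and numbers are invented for illustration, and link-level hotspots are not modeled.

```python
from collections import defaultdict

def key_cost(per_node, target, dist):
    """Data volume x hops needed to gather one key's items on `target`."""
    return sum(size * dist[src][target] for src, size in per_node.items())

def traffic(assignment, data, dist):
    """Total relocation cost of a key -> node assignment."""
    return sum(key_cost(data[k], n, dist) for k, n in assignment.items())

def loads(assignment, data):
    """Total data volume each node must reduce under the assignment."""
    out = defaultdict(int)
    for k, n in assignment.items():
        out[n] += sum(data[k].values())
    return dict(out)

def locality_greedy(data, nodes, dist):
    """Place every key on the node that minimizes that key's own traffic."""
    return {k: min(nodes, key=lambda n: key_cost(per, n, dist))
            for k, per in data.items()}

def capped_greedy(data, nodes, dist, cap):
    """Largest keys first; pick the cheapest node whose load stays within cap.
    If no node fits, fall back to the currently least-loaded node."""
    load = {n: 0 for n in nodes}
    assignment = {}
    for k in sorted(data, key=lambda k: -sum(data[k].values())):
        size = sum(data[k].values())
        fits = [n for n in nodes if load[n] + size <= cap]
        candidates = fits or [min(nodes, key=load.get)]
        best = min(candidates, key=lambda n: key_cost(data[k], n, dist))
        assignment[k] = best
        load[best] += size
    return assignment

# Hypothetical three-node line topology A - B - C (hop counts).
nodes = ["A", "B", "C"]
dist = {"A": {"A": 0, "B": 1, "C": 2},
        "B": {"A": 1, "B": 0, "C": 1},
        "C": {"A": 2, "B": 1, "C": 0}}
# Hypothetical intermediate data: key -> {node holding it: size}.
data = {"k1": {"A": 8, "B": 2},
        "k2": {"A": 6, "C": 1},
        "k3": {"A": 5, "B": 1}}

local = locality_greedy(data, nodes, dist)
balanced = capped_greedy(data, nodes, dist, cap=12)
print(local, traffic(local, data, dist), loads(local, data))
print(balanced, traffic(balanced, data, dist), loads(balanced, data))
```

In this toy input the locality-only placement sends every key to node A (traffic 5, load 23 on a single node), while the capped variant spreads the keys across A, B, and C (traffic 20, maximum load 10). The gap between the two shows why a scheduler that pursues all three goals at once has to trade some locality for balance.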
Indexed by: EI
Language: English
Source URL: http://ir.siat.ac.cn:8080/handle/172644/7001
Collection: Shenzhen Institutes of Advanced Technology, Digital Institute (数字所)
Author affiliation: 2015
Recommended citation
GB/T 7714
Wei Shi, Yang Wang, Boqiang Niu, et al. Smart Shuffling in MapReduce: A Solution to Balance Network Traffic and Workloads[C]. In: IEEE/ACM UCC. Cyprus.

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology

