Smart Shuffling in MapReduce: A Solution to Balance Network Traffic and Workloads
Document type: Conference paper
Authors | Wei Shi; Yang Wang; Boqiang Niu; William Lee Croft; Mengfei Peng |
Publication date | 2015 |
Conference name | IEEE/ACM UCC |
Conference location | Cyprus |
Abstract (English) | In the context of Hadoop, recent studies show that the shuffle operation accounts for as much as a third of the completion time of a MapReduce job. Consequently, the shuffle phase constitutes a crucial aspect of the scheduling of such jobs. During a shuffle phase, the job scheduler assigns reduce tasks to a set of reduce nodes. This may require multiple intermediate data items which share a key to be relocated to this new set of reduce nodes. In turn, this could cause a large volume of simultaneous data relocations within the network. Intuitively, a reduce task experiences shorter access latency if its required items are available locally or in close proximity. This, however, may also result in a hotspot in the network due to imbalanced traffic, as well as the imbalance of the workload on different nodes, regardless of their homogeneity. In this paper, we study data relocation incurred during the shuffle stage in the MapReduce framework. Within an arbitrary network, we aim at a) minimizing the overall network traffic, b) achieving workload balancing, and c) eliminating network hotspots, in order to improve the overall performance. Our contribution consists of the development of a scheduler that satisfies these three goals. We then present an in-depth simulation. Our results show that, for arbitrary network topologies, our Smart Shuffling Scheduler systematically outperforms the CoGRS scheduler in terms of hotspot elimination as well as reduce task load balancing, while ensuring traffic caused by data relocation is low. Furthermore, our algorithm is able to handle networks of any topology. In particular, for the tree topology commonly used within data centres, our proposed scheduler offers significant improvements over the CoGRS scheduler. |
Indexed in | EI |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/7001] |
Collection | Shenzhen Institutes of Advanced Technology_Digital Institute (数字所) |
Author affiliation | 2015 |
Recommended citation (GB/T 7714) | Wei Shi, Yang Wang, Boqiang Niu, et al. Smart Shuffling in MapReduce: A Solution to Balance Network Traffic and Workloads[C]. In: IEEE/ACM UCC. Cyprus. |
Ingestion method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology