RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop’s Configuration
Document type: Journal article
Authors | Zhendong Bei; Zhibin Yu; Huiling Zhang; Wen Xiong; Chengzhong Xu; Lieven Eeckhout; Shengzhong Feng |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Publication date | 2016 |
Abstract (English) | Hadoop is a widely used implementation framework of the MapReduce programming model for large-scale data processing. Hadoop performance, however, is significantly affected by the settings of the Hadoop configuration parameters. Unfortunately, manually tuning these parameters is very time-consuming, if practical at all. This paper proposes an approach, called RFHOC, to automatically tune the Hadoop configuration parameters for optimized performance for a given application running on a given cluster. RFHOC constructs two ensembles of performance models using a random-forest approach, one for the map stage and one for the reduce stage. Leveraging these models, RFHOC employs a genetic algorithm to automatically search the Hadoop configuration space. The evaluation of RFHOC using five typical Hadoop programs, each with five different input data sets, shows that it achieves a performance speedup by a factor of 2.11 on average and up to 7.4 over the recently proposed cost-based optimization (CBO) approach. In addition, RFHOC’s performance benefit increases with input data set size. |
Indexed in | SCI |
Original source | http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7132754 |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/10232] |
Collection | Shenzhen Institutes of Advanced Technology_Digital Institute (深圳先进技术研究院_数字所) |
Author affiliation | IEEE Transactions on Parallel and Distributed Systems |
Recommended citation GB/T 7714 | Zhendong Bei, Zhibin Yu, Huiling Zhang, et al. RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop’s Configuration[J]. IEEE Transactions on Parallel and Distributed Systems, 2016. |
APA | Zhendong Bei, Zhibin Yu, Huiling Zhang, Wen Xiong, Chengzhong Xu, ... & Shengzhong Feng. (2016). RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop’s Configuration. IEEE Transactions on Parallel and Distributed Systems. |
MLA | Zhendong Bei, et al. "RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop’s Configuration." IEEE Transactions on Parallel and Distributed Systems (2016). |
Ingest method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology
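
The search step described in the abstract (a genetic algorithm exploring the configuration space, guided by per-stage performance models) can be illustrated with a minimal sketch. This is not the authors' implementation. The parameter names and ranges in `SPACE`, the GA settings, and the two `predict_*` functions are all assumptions made for illustration; in particular, the `predict_*` functions are toy analytic stand-ins for the trained random-forest model ensembles, which the record does not describe in detail.

```python
import random

# Hypothetical parameter names and ranges (illustrative only; not from the paper).
SPACE = {
    "io_sort_mb": (50, 500),
    "reduce_tasks": (1, 64),
    "sort_factor": (10, 100),
}

def predict_map_time(cfg):
    # Toy stand-in for the map-stage performance model.
    return 100.0 + 0.002 * (cfg["io_sort_mb"] - 300) ** 2

def predict_reduce_time(cfg):
    # Toy stand-in for the reduce-stage performance model.
    return (80.0 + 0.5 * abs(cfg["reduce_tasks"] - 32)
            + 0.01 * (cfg["sort_factor"] - 60) ** 2)

def predicted_time(cfg):
    # Fitness: predicted total run time (lower is better).
    return predict_map_time(cfg) + predict_reduce_time(cfg)

def random_config(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover: each parameter is taken from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}

def mutate(cfg, rng, rate=0.2):
    # Resample each parameter within its range with probability `rate`.
    out = dict(cfg)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[k] = rng.randint(lo, hi)
    return out

def genetic_search(generations=40, pop_size=30, elite=5, seed=0):
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=predicted_time)
        parents = pop[:elite]  # elitism: the best configurations survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = parents + children
    return min(pop, key=predicted_time)

best = genetic_search()
print(best, predicted_time(best))
```

Because the fitness function only queries the models, each candidate configuration is scored without running a Hadoop job; swapping the toy functions for trained regressors would leave the search loop unchanged.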