Chinese Academy of Sciences Institutional Repositories Grid
MIA: Metric Importance Analysis for Big Data Workload Characterization

Document type: Journal article

Authors: Zhibin Yu; Wen Xiong; Lieven Eeckhout; Zhengdong Bei; Avi Mendelson; Chengzhong Xu
Journal: IEEE Transactions on Parallel and Distributed Systems (TPDS)
Publication date: 2017
Document subtype: Journal article
Abstract: Data analytics is at the foundation of both high-quality products and services in modern economies and societies. Big data workloads run on complex large-scale computing clusters, which implies significant challenges for deeply understanding and characterizing overall system performance. In general, performance is affected by many factors at multiple layers in the system stack, hence it is challenging to identify the key metrics when understanding big data workload performance. In this paper, we propose a novel workload characterization methodology using ensemble learning, called Metric Importance Analysis (MIA), to quantify the respective importance of workload metrics. By focusing on the most important metrics, MIA reduces the complexity of the analysis without losing information. Moreover, we develop the MIA-based Kiviat Plot (MKP) and Benchmark Similarity Matrix (BSM) which provide more insightful information than the traditional linkage clustering based dendrogram to visualize program behavior (dis)similarity. To demonstrate the applicability of MIA, we use it to characterize three big data benchmark suites: HiBench, CloudRank-D and SZTS. The results show that MIA is able to characterize complex big data workloads in a simple, intuitive manner, and reveal interesting insights. Moreover, through a case study, we demonstrate that tuning the configuration parameters related to the important metrics found by MIA results in higher performance improvements than through tuning the parameters related to the less important ones.
Language: English
Source URL: http://ir.siat.ac.cn:8080/handle/172644/12535
Collection: Shenzhen Institutes of Advanced Technology (深圳先进技术研究院), Digital Institute (数字所)
Author affiliation: IEEE Transactions on Parallel and Distributed Systems (TPDS)
Recommended citation
GB/T 7714
Zhibin Yu, Wen Xiong, Lieven Eeckhout, et al. MIA: Metric Importance Analysis for Big Data Workload Characterization[J]. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2017.
APA Zhibin Yu, Wen Xiong, Lieven Eeckhout, Zhengdong Bei, Avi Mendelson, & Chengzhong Xu. (2017). MIA: Metric Importance Analysis for Big Data Workload Characterization. IEEE Transactions on Parallel and Distributed Systems (TPDS).
MLA Zhibin Yu, et al. "MIA: Metric Importance Analysis for Big Data Workload Characterization". IEEE Transactions on Parallel and Distributed Systems (TPDS) (2017).

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology

