Chinese Academy of Sciences Institutional Repositories Grid
Minimum spanning tree based classification model for massive data with MapReduce implementation

Document type: Conference paper

Authors: Jin Chang; Jun Luo; Joshua Zhexue Huang; Shengzhong Feng; Jianping Fan
Publication date: 2010
Conference: 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010
Abstract (English): The rapid growth of data has provided us with more information, yet it challenges traditional techniques for extracting useful knowledge. In this paper, we propose MCMM, a Minimum spanning tree (MST) based Classification model for Massive data with MapReduce implementation. It can be viewed as an intermediate model between the traditional K nearest neighbor (KNN) method and the cluster-based classification method, aiming to overcome their disadvantages and cope with large amounts of data. Our model is implemented on the Hadoop platform using its MapReduce programming framework, which is particularly suitable for cloud computing. We ran experiments on several data sets, including real-world data from the UCI repository and synthetic data, using Downing 4000 clusters installed with Hadoop. The results show that our model generally outperforms KNN and some other classification methods with respect to accuracy and scalability.
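Indexed by: EI
Language: English
Source URL: http://ir.siat.ac.cn:8080/handle/172644/3118
Collection: Shenzhen Institutes of Advanced Technology_Digital Institute (数字所)
Author affiliation: 2010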
Recommended citation
GB/T 7714
Jin Chang, Jun Luo, Joshua Zhexue Huang, et al. Minimum spanning tree based classification model for massive data with MapReduce implementation[C]. In: 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010.

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology
