Minimum spanning tree based classification model for massive data with MapReduce implementation
Document type: Conference paper
Authors | Jin Chang; Jun Luo; Joshua Zhexue Huang; Shengzhong Feng; Jianping Fan |
Publication date | 2010 |
Conference name | 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010 |
Abstract | The rapid growth of data has provided us with more information, yet it challenges traditional techniques for extracting useful knowledge. In this paper, we propose MCMM, a Minimum spanning tree (MST) based Classification model for Massive data with MapReduce implementation. It can be viewed as an intermediate model between the traditional K nearest neighbor (KNN) method and the cluster-based classification method, aiming to overcome their disadvantages and cope with large amounts of data. Our model is implemented on the Hadoop platform, using its MapReduce programming framework, which is particularly suitable for cloud computing. We ran experiments on several data sets, including real-world data from the UCI repository and synthetic data, using Downing 4000 clusters installed with Hadoop. The results show that our model generally outperforms KNN and some other classification methods with respect to accuracy and scalability. |
Indexed in | EI |
Language | English |
Source URL | http://ir.siat.ac.cn:8080/handle/172644/3118 |
Collection | Shenzhen Institutes of Advanced Technology_数字所 |
Author affiliation | 2010 |
Recommended citation (GB/T 7714) | Jin Chang, Jun Luo, Joshua Zhexue Huang, et al. Minimum spanning tree based classification model for massive data with MapReduce implementation[C]. In: 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010. |
Ingest method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology