Parallel Incremental Frequent Itemset Mining for Large Data
Document type: Journal article
Authors | Song, Yu-Geng1,2; Cui, Hui-Min1,2; Feng, Xiao-Bing1 |
Journal | JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
Publication date | 2017-03-01 |
Volume | 32; Issue: 2; Pages: 368-385 |
Keywords | incremental parallel FPGrowth; data mining; frequent itemset mining; MapReduce |
ISSN | 1000-9000 |
DOI | 10.1007/s11390-017-1726-y |
Abstract | Frequent itemset mining (FIM) is a popular data mining issue adopted in many fields, such as commodity recommendation in the retail industry, log analysis in web searching, and query recommendation (or related search). A large number of FIM algorithms have been proposed to obtain better performance, including parallelized algorithms for processing large data volumes. Besides, incremental FIM algorithms are also proposed to deal with incremental database updates. However, most of these incremental algorithms have low parallelism, causing low efficiency on huge databases. This paper presents two parallel incremental FIM algorithms called IncMiningPFP and IncBuildingPFP, implemented on the MapReduce framework. IncMiningPFP preserves the FP-tree mining results of the original pass, and utilizes them for incremental calculations. In particular, we propose a method to generate a partial FP-tree in the incremental pass, in order to avoid unnecessary mining work. Further, some of the incremental parallel tasks can be omitted when the inserted transactions include fewer items. IncBuildingPFP preserves the CanTrees built in the original pass, and then adds new transactions to them during the incremental passes. Our experimental results show that IncMiningPFP can achieve significant speedup over PFP (Parallel FPGrowth) and a sequential incremental algorithm (CanTree) in most cases of incremental input database, and in other cases IncBuildingPFP can achieve it. |
Funding | National High Technology Research and Development 863 Program of China[2015AA011505] ; National High Technology Research and Development 863 Program of China[2015AA015306] ; National High Technology Research and Development 863 Program of China[2012AA010902] ; National Natural Science Foundation of China[61202055] ; National Natural Science Foundation of China[61221062] ; National Natural Science Foundation of China[61521092] ; National Natural Science Foundation of China[61303053] ; National Natural Science Foundation of China[61432016] ; National Natural Science Foundation of China[61402445] ; National Natural Science Foundation of China[61672492] ; National Key Research and Development Program of China[2016YFB1000402] |
WOS research area | Computer Science |
Language | English |
WOS accession number | WOS:000397835500014 |
Publisher | SCIENCE PRESS |
Source URL | [http://119.78.100.204/handle/2XEOYT63/7411] |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding author | Song, Yu-Geng |
Author affiliations | 1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China |
Recommended citation (GB/T 7714) | Song, Yu-Geng,Cui, Hui-Min,Feng, Xiao-Bing. Parallel Incremental Frequent Itemset Mining for Large Data[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,2017,32(2):368-385. |
APA | Song, Yu-Geng,Cui, Hui-Min,&Feng, Xiao-Bing.(2017).Parallel Incremental Frequent Itemset Mining for Large Data.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,32(2),368-385. |
MLA | Song, Yu-Geng,et al."Parallel Incremental Frequent Itemset Mining for Large Data".JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 32.2(2017):368-385. |
Ingest method: OAI harvesting
Source: Institute of Computing Technology
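
The abstract describes preserving a tree built in the original pass and adding new transactions to it in later passes. The following is a minimal single-machine sketch of that general idea, not the authors' IncBuildingPFP implementation: it assumes a CanTree-style prefix tree in which items follow a fixed canonical (here lexicographic) order, so new transactions can be inserted without restructuring. The names `CanTree`, `CanTreeNode`, `frequent_itemsets`, `min_count`, and `max_size` are hypothetical, and the mining step is brute-force subset counting rather than FP-growth or a MapReduce job.

```python
from collections import Counter
from itertools import combinations


class CanTreeNode:
    """One node of the prefix tree: an item plus the number of transactions passing through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class CanTree:
    """Prefix tree over transactions whose items are kept in a fixed canonical order."""

    def __init__(self):
        self.root = CanTreeNode()
        self.num_transactions = 0

    def insert(self, transaction):
        # Sorting gives a canonical order that does not depend on item frequency,
        # so later insertions never force the tree to be rebuilt.
        node = self.root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, CanTreeNode(item))
            node.count += 1
        self.num_transactions += 1

    def paths(self):
        """Yield (items, n) where n transactions consist of exactly these items."""
        stack = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            ending_here = node.count - sum(c.count for c in node.children.values())
            if prefix and ending_here > 0:
                yield prefix, ending_here
            for child in node.children.values():
                stack.append((child, prefix + (child.item,)))


def frequent_itemsets(tree, min_count, max_size=3):
    """Brute-force support counting over the stored paths (illustrative, not FP-growth)."""
    support = Counter()
    for items, count in tree.paths():
        for k in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, k):
                support[combo] += count
    return {itemset: c for itemset, c in support.items() if c >= min_count}


# Original pass: build the tree once and mine it.
tree = CanTree()
for t in [["a", "b", "c"], ["a", "b"], ["b", "c"]]:
    tree.insert(t)
before = frequent_itemsets(tree, min_count=2)

# Incremental pass: add only the new transactions to the preserved tree, then mine again.
for t in [["a", "c"], ["a", "b", "c"]]:
    tree.insert(t)
after = frequent_itemsets(tree, min_count=3)
```

In this sketch the incremental pass touches only the inserted transactions when updating the tree; the mining step still scans the whole tree, which is the kind of repeated work the abstract says the paper's algorithms aim to reduce.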