Chinese Academy of Sciences Institutional Repositories Grid
Multi-scaling sampling: an adaptive sampling method for discovering approximate association rules

Document type: Journal article

Authors: Jia, CY; Gao, XP
Journal: Journal of Computer Science and Technology
Publication date: 2005-05-01
Volume: 20; Issue: 3; Pages: 309-318
Keywords: Data mining; Association rule; Frequent itemset; Sample error; Multi-scaling sampling
ISSN: 1000-9000
Corresponding author: Jia, CY (jiacy@knowledge_science.ict.ac.cn)
Abstract: One obstacle to efficient association rule mining is the explosive growth of data sets, since scanning large databases, especially multiple times, is costly or impossible. A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. However, how to effectively define and efficiently estimate the degree of error in the algorithm's outcome, and how to determine the required sample size, remain open research problems. In this paper, an effective and efficient algorithm based on PAC (probably approximately correct) learning theory is given to measure and estimate sample error. Then a new adaptive, on-line, fast sampling strategy, multi-scaling sampling, inspired by MRA (multi-resolution analysis) and the Shannon sampling theorem, is presented for quickly obtaining acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study show that the sampling strategy achieves a very good speed-accuracy trade-off.
WOS keywords: ALGORITHM ; DATABASES ; PARALLEL
WOS research area: Computer Science
WOS categories: Computer Science, Hardware & Architecture ; Computer Science, Software Engineering
Language: English
WOS accession number: WOS:000229292300003
Publisher: SCIENCE PRESS
URI: http://www.irgrid.ac.cn/handle/1471x/2377258
Collection: University of Chinese Academy of Sciences
Corresponding author: Jia, CY
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
2. Chinese Acad Sci, Grad Sch, Beijing 100039, Peoples R China
3. Xiangtan Univ, Informat Engn Coll, Xiangtan 411105, Peoples R China
Recommended citation
GB/T 7714: Jia, CY, Gao, XP. Multi-scaling sampling: an adaptive sampling method for discovering approximate association rules[J]. Journal of computer science and technology, 2005, 20(3): 309-318.
APA: Jia, CY, & Gao, XP. (2005). Multi-scaling sampling: an adaptive sampling method for discovering approximate association rules. Journal of computer science and technology, 20(3), 309-318.
MLA: Jia, CY, et al. "Multi-scaling sampling: an adaptive sampling method for discovering approximate association rules". Journal of computer science and technology 20.3 (2005): 309-318.

Ingest method: iSwitch harvesting

Source: University of Chinese Academy of Sciences


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.