Chinese Academy of Sciences Institutional Repositories Grid
Text clustering using frequent itemsets

Document type: Journal article

Authors: Zhang Wen; Yoshida Taketoshi; Tang Xijin; Wang Qing
Journal: Knowledge-based Systems
Publication date: 2010
Volume: 23; Issue: 5; Pages: 379-388
Keywords: Document clustering; Frequent itemsets; Maximum capturing; Similarity measure; Competitive learning; SEQUENCES
Corresponding author: Zhang, W (corresponding author), Chinese Acad Sci, Inst Software, Lab Internet Software Technol, Beijing 100190, Peoples R China
Indexed in: SCI
WOS accession number: WOS:000278881300002
Date available: 2010-08-23
Notes: Frequent itemsets originate from association rule mining. Recently, they have been applied in text mining tasks such as document categorization and clustering. In this paper, we conduct a study on text clustering using frequent itemsets. The contribution of this paper is threefold. First, we present a review of existing methods of document clustering using frequent patterns. Second, a new method called Maximum Capturing is proposed for document clustering. Maximum Capturing includes two procedures: constructing document clusters and assigning cluster topics. We develop three versions of Maximum Capturing based on three similarity measures. We propose a normalization process based on frequency sensitive competitive learning for Maximum Capturing to merge cluster candidates into a predefined number of clusters. Third, experiments are carried out to evaluate the proposed method in comparison with the CFWS, CMS, FTC and FIHC methods. Experimental results show that, in clustering, Maximum Capturing performs better than the other methods mentioned above. In particular, Maximum Capturing with representation using individual words and similarity measure using asymmetrical binary similarity achieves the best performance. Moreover, topics produced by Maximum Capturing distinguish clusters from each other and can be used as labels of document clusters. (C) 2010 Elsevier B.V. All rights reserved.
Source URL: [http://124.16.136.157/handle/311060/3910]
Collection: Institute of Software_Laboratory of Internet Software Technology_Journal articles
Recommended citation
GB/T 7714
Zhang Wen, Yoshida Taketoshi, Tang Xijin, et al. Text clustering using frequent itemsets[J]. Knowledge-based Systems, 2010, 23(5): 379-388.
APA Zhang Wen, Yoshida Taketoshi, Tang Xijin, & Wang Qing. (2010). Text clustering using frequent itemsets. Knowledge-based Systems, 23(5), 379-388.
MLA Zhang Wen, et al. "Text clustering using frequent itemsets." Knowledge-based Systems 23.5 (2010): 379-388.

Ingest method: OAI harvesting

Source: Institute of Software


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.