Chinese Academy of Sciences Institutional Repositories Grid
Approximate pairwise clustering for large data sets via sampling plus extension

Document type: Journal article

Authors: Wang Liang; Leckie Christopher; Kotagiri Ramamohanarao; Bezdek James
Journal: Pattern Recognition
Publication date: 2011
Volume: 44, Issue: 2, Pages: 222-235
ISSN: 0031-3203
Abstract: Pairwise clustering methods have shown great promise for many real-world applications. However, the computational demands of these methods make them impractical for use with large data sets. The contribution of this paper is a simple but efficient method, called eSPEC, that makes clustering feasible for problems involving large data sets. Our solution adopts a "sampling, clustering plus extension" strategy. The methodology starts by selecting a small number of representative samples from the relational pairwise data using a selective sampling scheme; then the chosen samples are grouped using a pairwise clustering algorithm combined with local scaling; and finally, the label assignments of the remaining instances in the data are extended as a classification problem in a low-dimensional space, which is explicitly learned from the labeled samples using a cluster-preserving graph embedding technique. Extensive experimental results on several synthetic and real-world data sets demonstrate both the feasibility of approximately clustering large data sets and acceleration of clustering in loadable data sets of our method. © 2010 Elsevier Ltd.
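The "sampling, clustering plus extension" strategy in the abstract can be illustrated with a rough Python sketch. It is not the authors' eSPEC implementation, and it makes several simplifying assumptions: uniform random sampling stands in for the selective sampling scheme, a basic normalised spectral clustering with a locally scaled Gaussian affinity stands in for the pairwise clustering step, and a nearest-sample rule in the input space stands in for classification in the learned graph embedding. It also assumes feature vectors rather than relational pairwise data. All function names and parameters (such as `k=7` and `n_samples`) are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X, Y):
    # Squared Euclidean distances between rows of X and rows of Y.
    d = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.maximum(d, 0.0)

def local_scaling_affinity(X, k=7):
    # A_ij = exp(-d_ij^2 / (s_i * s_j)), where s_i is the distance from
    # point i to its k-th nearest neighbour (a per-point local scale).
    d2 = pairwise_sq_dists(X, X)
    k = min(k, len(X) - 1)
    s = np.sqrt(np.sort(d2, axis=1)[:, k])  # column 0 is the point itself
    A = np.exp(-d2 / (np.outer(s, s) + 1e-12))
    np.fill_diagonal(A, 0.0)
    return A

def spectral_labels(A, n_clusters, n_iter=50):
    # Normalised affinity D^{-1/2} A D^{-1/2}, top eigenvectors, then k-means.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(1) + 1e-12)
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    U = vecs[:, -n_clusters:]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # Deterministic farthest-point initialisation of the k-means centres.
    centers = [U[0]]
    for _ in range(1, n_clusters):
        d = pairwise_sq_dists(U, np.array(centers)).min(1)
        centers.append(U[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = pairwise_sq_dists(U, centers).argmin(1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(0)
    return labels

def sample_cluster_extend(X, n_clusters, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1 (sampling): uniform random subset; the paper uses selective sampling.
    idx = rng.choice(len(X), size=min(n_samples, len(X)), replace=False)
    S = X[idx]
    # Step 2 (clustering): cluster only the small sample.
    sample_labels = spectral_labels(local_scaling_affinity(S), n_clusters)
    # Step 3 (extension): every point takes the label of its nearest sample.
    nearest = pairwise_sq_dists(X, S).argmin(1)
    return sample_labels[nearest]
```

The point of the structure is cost: the affinity matrix and eigendecomposition are computed only for the sample, while the remaining points need just one distance computation against the sample.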
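Indexed by: EI
Language: English
Date made public: 2013-10-08
Source URL: [http://ir.iscas.ac.cn/handle/311060/16176]
Collection: Institute of Software_Institute of Software Library_Journal articles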
Recommended citation
GB/T 7714: Wang Liang, Leckie Christopher, Kotagiri Ramamohanarao, et al. Approximate pairwise clustering for large data sets via sampling plus extension[J]. Pattern Recognition, 2011, 44(2): 222-235.
APA: Wang Liang, Leckie Christopher, Kotagiri Ramamohanarao, & Bezdek James. (2011). Approximate pairwise clustering for large data sets via sampling plus extension. Pattern Recognition, 44(2), 222-235.
MLA: Wang Liang, et al. "Approximate pairwise clustering for large data sets via sampling plus extension". Pattern Recognition 44.2 (2011): 222-235.

Ingest method: OAI harvesting

Source: Institute of Software

