approximate pairwise clustering for large data sets via sampling plus extension
文献类型:期刊论文
作者 | Wang Liang ; Leckie Christopher ; Kotagiri Ramamohanarao ; Bezdek James |
刊名 | Pattern Recognition
![]() |
出版日期 | 2011 |
卷号 | 44期号:2页码:222-235 |
ISSN号 | 0031-3203 |
中文摘要 | Pairwise clustering methods have shown great promise for many real-world applications. However, the computational demands of these methods make them impractical for use with large data sets. The contribution of this paper is a simple but efficient method, called eSPEC, that makes clustering feasible for problems involving large data sets. Our solution adopts a "sampling, clustering plus extension" strategy. The methodology starts by selecting a small number of representative samples from the relational pairwise data using a selective sampling scheme; then the chosen samples are grouped using a pairwise clustering algorithm combined with local scaling; and finally, the label assignments of the remaining instances in the data are extended as a classification problem in a low-dimensional space, which is explicitly learned from the labeled samples using a cluster-preserving graph embedding technique. Extensive experimental results on several synthetic and real-world data sets demonstrate both the feasibility of approximately clustering large data sets and acceleration of clustering in loadable data sets of our method. © 2010 Elsevier Ltd. |
英文摘要 | Pairwise clustering methods have shown great promise for many real-world applications. However, the computational demands of these methods make them impractical for use with large data sets. The contribution of this paper is a simple but efficient method, called eSPEC, that makes clustering feasible for problems involving large data sets. Our solution adopts a "sampling, clustering plus extension" strategy. The methodology starts by selecting a small number of representative samples from the relational pairwise data using a selective sampling scheme; then the chosen samples are grouped using a pairwise clustering algorithm combined with local scaling; and finally, the label assignments of the remaining instances in the data are extended as a classification problem in a low-dimensional space, which is explicitly learned from the labeled samples using a cluster-preserving graph embedding technique. Extensive experimental results on several synthetic and real-world data sets demonstrate both the feasibility of approximately clustering large data sets and acceleration of clustering in loadable data sets of our method. © 2010 Elsevier Ltd. |
收录类别 | EI |
语种 | 英语 |
公开日期 | 2013-10-08 |
源URL | [http://ir.iscas.ac.cn/handle/311060/16176] ![]() |
专题 | 软件研究所_软件所图书馆_期刊论文 |
推荐引用方式 GB/T 7714 | Wang Liang,Leckie Christopher,Kotagiri Ramamohanarao,et al. approximate pairwise clustering for large data sets via sampling plus extension[J]. Pattern Recognition,2011,44(2):222-235. |
APA | Wang Liang,Leckie Christopher,Kotagiri Ramamohanarao,&Bezdek James.(2011).approximate pairwise clustering for large data sets via sampling plus extension.Pattern Recognition,44(2),222-235. |
MLA | Wang Liang,et al."approximate pairwise clustering for large data sets via sampling plus extension".Pattern Recognition 44.2(2011):222-235. |
入库方式: OAI收割
来源:软件研究所
浏览0
下载0
收藏0
其他版本
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。