中国科学院机构知识库网格
Chinese Academy of Sciences Institutional Repositories Grid
Grid-based DBSCAN: Indexing and inference

文献类型:期刊论文

作者Zhuang, Fuzhen3,4; He, Qing3,4; Boonchoo, Thapana3,4; Ao, Xiang3,4; Liu, Yang3,4; Zhao, Weizhong1,2
刊名PATTERN RECOGNITION
出版日期2019-06-01
卷号90页码:271-284
关键词Density-based clustering Grid-based DBSCAN Union-find algorithm
ISSN号0031-3203
DOI10.1016/j.patcog.2019.01.034
英文摘要DBSCAN is one of clustering algorithms which can report arbitrarily-shaped clusters and noises without requiring the number of clusters as a parameter (unlike the other clustering algorithms, k-means, for example). Because the running time of DBSCAN has quadratic order of growth, i.e. O(n(2)), research studies on improving its performance have been received a considerable amount of attention for decades. Grid-based DBSCAN is a well-developed algorithm whose complexity is improved to O(nlog n) in 2D space, while requiring Omega(n(4/3)) to solve when dimension >= 3. However, we find that Grid-based DBSCAN suffers from two problems: neighbour explosion and redundancies in merging, which make the algorithms infeasible in high dimensional space. In this paper we first propose a novel algorithm called GDCF which utilizes bitmap indexing to support efficient neighbour grid queries. Second, based on the concept of union-find algorithm we devise a forest-like structure, called cluster forest, to alleviate the redundancies in the merging. Moreover, we find that running the cluster forest in different orders can lead to a different number of merging operations needed to perform in the merging step. We propose to perform the merging step in a uniform random order to optimize the number of merging operations. However, for high-density database, a bottleneck could be occurred, we further propose a low-density-first order to alleviate this bottleneck. The experiments resulted on both real-world and synthetic datasets demonstrate that the proposed algorithm outperforms the state-of-the-art exact/approximate DBSCAN and suggests a good scalability. (C) 2019 Elsevier Ltd. All rights reserved.
资助项目National Key Research and Development Program of China[2017YFB1002104] ; National Natural Science Foundation of China[U1811461] ; National Natural Science Foundation of China[61602438] ; National Natural Science Foundation of China[91846113] ; National Natural Science Foundation of China[61573335] ; CCF-Tencent Rhino-Bird Young Faculty Open Research Fund[RAGR20180111] ; Ant Financial through the Ant Financial Science Funds for Security Research
WOS研究方向Computer Science ; Engineering
语种英语
WOS记录号WOS:000463130400023
出版者ELSEVIER SCI LTD
源URL[http://119.78.100.204/handle/2XEOYT63/4149]  
专题中国科学院计算技术研究所期刊论文_英文
通讯作者Ao, Xiang
作者单位1.Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lea, Wuhan, Hubei, Peoples R China
2.Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China
3.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
4.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
推荐引用方式
GB/T 7714
Zhuang, Fuzhen,He, Qing,Boonchoo, Thapana,et al. Grid-based DBSCAN: Indexing and inference[J]. PATTERN RECOGNITION,2019,90:271-284.
APA Zhuang, Fuzhen,He, Qing,Boonchoo, Thapana,Ao, Xiang,Liu, Yang,&Zhao, Weizhong.(2019).Grid-based DBSCAN: Indexing and inference.PATTERN RECOGNITION,90,271-284.
MLA Zhuang, Fuzhen,et al."Grid-based DBSCAN: Indexing and inference".PATTERN RECOGNITION 90(2019):271-284.

入库方式: OAI收割

来源:计算技术研究所

浏览0
下载0
收藏0
其他版本

除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。