Grid-based DBSCAN: Indexing and inference
文献类型:期刊论文
作者 | Zhuang, Fuzhen3,4; He, Qing3,4; Boonchoo, Thapana3,4; Ao, Xiang3,4; Liu, Yang3,4; Zhao, Weizhong1,2 |
刊名 | PATTERN RECOGNITION
![]() |
出版日期 | 2019-06-01 |
卷号 | 90页码:271-284 |
关键词 | Density-based clustering Grid-based DBSCAN Union-find algorithm |
ISSN号 | 0031-3203 |
DOI | 10.1016/j.patcog.2019.01.034 |
英文摘要 | DBSCAN is one of clustering algorithms which can report arbitrarily-shaped clusters and noises without requiring the number of clusters as a parameter (unlike the other clustering algorithms, k-means, for example). Because the running time of DBSCAN has quadratic order of growth, i.e. O(n(2)), research studies on improving its performance have been received a considerable amount of attention for decades. Grid-based DBSCAN is a well-developed algorithm whose complexity is improved to O(nlog n) in 2D space, while requiring Omega(n(4/3)) to solve when dimension >= 3. However, we find that Grid-based DBSCAN suffers from two problems: neighbour explosion and redundancies in merging, which make the algorithms infeasible in high dimensional space. In this paper we first propose a novel algorithm called GDCF which utilizes bitmap indexing to support efficient neighbour grid queries. Second, based on the concept of union-find algorithm we devise a forest-like structure, called cluster forest, to alleviate the redundancies in the merging. Moreover, we find that running the cluster forest in different orders can lead to a different number of merging operations needed to perform in the merging step. We propose to perform the merging step in a uniform random order to optimize the number of merging operations. However, for high-density database, a bottleneck could be occurred, we further propose a low-density-first order to alleviate this bottleneck. The experiments resulted on both real-world and synthetic datasets demonstrate that the proposed algorithm outperforms the state-of-the-art exact/approximate DBSCAN and suggests a good scalability. (C) 2019 Elsevier Ltd. All rights reserved. |
资助项目 | National Key Research and Development Program of China[2017YFB1002104] ; National Natural Science Foundation of China[U1811461] ; National Natural Science Foundation of China[61602438] ; National Natural Science Foundation of China[91846113] ; National Natural Science Foundation of China[61573335] ; CCF-Tencent Rhino-Bird Young Faculty Open Research Fund[RAGR20180111] ; Ant Financial through the Ant Financial Science Funds for Security Research |
WOS研究方向 | Computer Science ; Engineering |
语种 | 英语 |
WOS记录号 | WOS:000463130400023 |
出版者 | ELSEVIER SCI LTD |
源URL | [http://119.78.100.204/handle/2XEOYT63/4149] ![]() |
专题 | 中国科学院计算技术研究所期刊论文_英文 |
通讯作者 | Ao, Xiang |
作者单位 | 1.Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lea, Wuhan, Hubei, Peoples R China 2.Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China 3.Univ Chinese Acad Sci, Beijing 100049, Peoples R China 4.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China |
推荐引用方式 GB/T 7714 | Zhuang, Fuzhen,He, Qing,Boonchoo, Thapana,et al. Grid-based DBSCAN: Indexing and inference[J]. PATTERN RECOGNITION,2019,90:271-284. |
APA | Zhuang, Fuzhen,He, Qing,Boonchoo, Thapana,Ao, Xiang,Liu, Yang,&Zhao, Weizhong.(2019).Grid-based DBSCAN: Indexing and inference.PATTERN RECOGNITION,90,271-284. |
MLA | Zhuang, Fuzhen,et al."Grid-based DBSCAN: Indexing and inference".PATTERN RECOGNITION 90(2019):271-284. |
入库方式: OAI收割
来源:计算技术研究所
浏览0
下载0
收藏0
其他版本
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。