Chinese Academy of Sciences Institutional Repositories Grid
SAHA: A String Adaptive Hash Table for Analytical Databases

Document type: Journal article

Authors: Zheng, Tianqi [1,2]; Zhang, Zhibin [1]; Cheng, Xueqi [1,2]
Journal: APPLIED SCIENCES-BASEL
Publication date: 2020-03-01
Volume: 10; Issue: 6; Pages: 18
Keywords: hash table; analytical database; string data
DOI: 10.3390/app10061915
Abstract: Hash tables are the fundamental data structure for analytical database workloads, such as aggregation, joining, set filtering and records deduplication. The performance aspects of hash tables differ drastically with respect to what kind of data are being processed or how many inserts, lookups and deletes are constructed. In this paper, we address some common use cases of hash tables: aggregating and joining over arbitrary string data. We designed a new hash table, SAHA, which is tightly integrated with modern analytical databases and optimized for string data with the following advantages: (1) it inlines short strings and saves hash values for long strings only; (2) it uses special memory loading techniques to do quick dispatching and hashing computations; and (3) it utilizes vectorized processing to batch hashing operations. Our evaluation results reveal that SAHA outperforms state-of-the-art hash tables by one to five times in analytical workloads, including Google's SwissTable and Facebook's F14Table. It has been merged into the ClickHouse database and shows promising results in production.
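The length-based dispatch described in the abstract can be illustrated with a small sketch. This is not the SAHA implementation (which lives in ClickHouse); it is a toy aggregation counter that only shows the general idea. The class name `StringAdaptiveCounter`, the 8-byte `INLINE_LIMIT`, and the use of Python's built-in `hash` are all assumptions made for illustration. Short keys are packed into an integer and stored inline with no saved hash, long keys are stored next to a saved hash value, and a batch entry point computes the hashes of long keys up front before probing.

```python
from typing import Dict, Iterable, List, Tuple

INLINE_LIMIT = 8  # assumption: keys of up to 8 bytes are packed into one integer


class StringAdaptiveCounter:
    """Toy aggregation table that dispatches on key length (illustrative only)."""

    def __init__(self) -> None:
        # Short keys: stored inline as (length, packed integer); no hash is saved.
        self.short: Dict[Tuple[int, int], int] = {}
        # Long keys: grouped under their saved hash; each entry is [key, count].
        self.long: Dict[int, List[list]] = {}

    @staticmethod
    def _pack(key: bytes) -> Tuple[int, int]:
        # The length is kept so that b"a" and b"a\x00" stay distinct after packing.
        return (len(key), int.from_bytes(key, "little"))

    def _add_long(self, key: bytes, saved_hash: int) -> None:
        bucket = self.long.setdefault(saved_hash, [])
        for entry in bucket:
            if entry[0] == key:
                entry[1] += 1
                return
        bucket.append([key, 1])

    def add_batch(self, keys: Iterable[bytes]) -> None:
        keys = list(keys)
        # Batch step: compute the hashes of all long keys before any probing.
        hashes = [hash(k) if len(k) > INLINE_LIMIT else None for k in keys]
        for key, saved_hash in zip(keys, hashes):
            if saved_hash is None:
                packed = self._pack(key)
                self.short[packed] = self.short.get(packed, 0) + 1
            else:
                self._add_long(key, saved_hash)

    def count(self, key: bytes) -> int:
        if len(key) <= INLINE_LIMIT:
            return self.short.get(self._pack(key), 0)
        for stored_key, n in self.long.get(hash(key), []):
            if stored_key == key:
                return n
        return 0
```

For example, after `add_batch([b"ab", b"ab", b"a much longer string key"])`, the key `b"ab"` is held in the inline table and the longer key is held in the table that keeps saved hashes.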
Funding: Strategic Priority Research Program of the Chinese Academy of Sciences [XDA19020400]
WOS research areas: Chemistry; Engineering; Materials Science; Physics
Language: English
WOS accession number: WOS:000529252800016
Publisher: MDPI
Source URL: [http://119.78.100.204/handle/2XEOYT63/15029]
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Zheng, Tianqi
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, CAS Key Lab Network Data Sci & Technol, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Recommended citation
GB/T 7714: Zheng, Tianqi; Zhang, Zhibin; Cheng, Xueqi. SAHA: A String Adaptive Hash Table for Analytical Databases[J]. APPLIED SCIENCES-BASEL, 2020, 10(6): 18.
APA: Zheng, Tianqi, Zhang, Zhibin, & Cheng, Xueqi. (2020). SAHA: A String Adaptive Hash Table for Analytical Databases. APPLIED SCIENCES-BASEL, 10(6), 18.
MLA: Zheng, Tianqi, et al. "SAHA: A String Adaptive Hash Table for Analytical Databases". APPLIED SCIENCES-BASEL 10.6 (2020): 18.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

