中国科学院机构知识库网格
Chinese Academy of Sciences Institutional Repositories Grid
PR-Sketch: Monitoring Per-key Aggregation of Streaming Data with Nearly Full Accuracy

文献类型:期刊论文

作者Sheng, Siyuan1; Huang, Qun2; Wang, Sa1; Bao, Yungang1
刊名PROCEEDINGS OF THE VLDB ENDOWMENT
出版日期2021-06-01
卷号14期号:10页码:1783-1796
ISSN号2150-8097
DOI10.14778/3467861.3467868
英文摘要Computing per-key aggregation is indispensable in streaming data analysis formulated as two phases, an update phase and a recovery phase. As the size and speed of data streams rise, accurate per-key information is useful in many applications like anomaly detection, attack prevention, and online diagnosis. Even though many algorithms have been proposed for per-key aggregation in stream processing, their accuracy guarantees only cover a small portion of keys. In this paper, we aim to achieve nearly full accuracy with limited resource usage. We follow the line of sketch-based techniques. We observe that existing methods suffer from high errors for most keys. The reason is that they track keys by complicated mechanism in the update phase and simply calculate per-key aggregation from some specific counter in the recovery phase. Therefore, we present PR-Sketch, a novel sketching design to address the two limitations. PR-Sketch builds linear equations between counter values and per-key aggregations to improve accuracy, and records keys in the recovery phase to reduce resource usage in the update phase. We also provide an extension called fast PR-Sketch to improve processing rate further. We derive space complexity, time complexity, and guaranteed error probability for both PR-Sketch and fast PR-Sketch. We conduct trace-driven experiments under 100K keys and 1M items to compare our algorithms with multiple state-of-the-art methods. Results demonstrate the resource efficiency and nearly full accuracy of our algorithms.
资助项目National Key R&D Program of China[2019YFB1802600] ; National Natural Science Foundation of China[61802365]
WOS研究方向Computer Science
语种英语
WOS记录号WOS:000742888800008
出版者ASSOC COMPUTING MACHINERY
源URL[http://119.78.100.204/handle/2XEOYT63/18982]  
专题中国科学院计算技术研究所期刊论文_英文
通讯作者Sheng, Siyuan
作者单位1.Univ Chinese Acad Sci, SKL Comp Architecture, ICT, CAS, Beijing, Peoples R China
2.Peking Univ, Beijing, Peoples R China
推荐引用方式
GB/T 7714
Sheng, Siyuan,Huang, Qun,Wang, Sa,et al. PR-Sketch: Monitoring Per-key Aggregation of Streaming Data with Nearly Full Accuracy[J]. PROCEEDINGS OF THE VLDB ENDOWMENT,2021,14(10):1783-1796.
APA Sheng, Siyuan,Huang, Qun,Wang, Sa,&Bao, Yungang.(2021).PR-Sketch: Monitoring Per-key Aggregation of Streaming Data with Nearly Full Accuracy.PROCEEDINGS OF THE VLDB ENDOWMENT,14(10),1783-1796.
MLA Sheng, Siyuan,et al."PR-Sketch: Monitoring Per-key Aggregation of Streaming Data with Nearly Full Accuracy".PROCEEDINGS OF THE VLDB ENDOWMENT 14.10(2021):1783-1796.

入库方式: OAI收割

来源:计算技术研究所

浏览0
下载0
收藏0
其他版本

除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。