Scout Sketch+: Finding Both Promising and Damping Items Simultaneously in Data Streams

Impact factor: 3.0 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture)
Guoju Gao;Tianyu Ma;He Huang;Yu-E Sun;Haibo Wang;Yang Du;Shigang Chen
IEEE/ACM Transactions on Networking, vol. 32, no. 6, pp. 5491-5506. Published 2024-10-04. DOI: 10.1109/TNET.2024.3469196. https://ieeexplore.ieee.org/document/10705120/
Citations: 0

Abstract

Data stream processing is valuable in many practical applications. This paper studies two new but important patterns for items in data streams, called promising and damping items. A promising item is one whose frequency over multiple consecutive time windows shows an overall upward trend, with a slight decrease allowed in some of those windows. A damping item, in contrast, exhibits a decreasing trend. Many applications can benefit from identifying promising or damping items, e.g., monitoring latent attacks in computer networks, pre-adjusting bandwidth allocation in communication channels, detecting potential hot events or news, or finding topics that gradually lose momentum in social networks. We first show how to accurately find promising items in data streams in real time under limited memory. To this end, we propose a novel structure named Scout Sketch, which consists of a Filter and a Finder. The Filter is based on the Bloom filter and eliminates unqualified items with low memory overhead; the Finder records the necessary information about potential items and detects promising items at the end of each time window using tailor-made detection operations. We then enhance Scout Sketch (as Scout Sketch+) to adaptively detect both promising and damping items simultaneously. Finally, extensive experiments on four real-world datasets show that the F1 score and throughput of Scout Sketch(+) are about 2.02 and 5.61 times those of the compared solutions, respectively. All source code is available on GitHub (https://github.com/Aoohhh/ScoutSketch).
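
To make the two patterns concrete, here is a minimal Python sketch of how a per-window frequency series might be labeled. It is not the paper's algorithm: the abstract does not give the formal definitions, so the rule used here (the last window compared with the first, plus a cap on the number of opposite-direction steps) is an assumption, as are the names `classify_trend` and `max_reversals`. Allowing a few increases inside a damping series is likewise assumed by symmetry.

```python
from typing import Sequence


def classify_trend(freqs: Sequence[int], max_reversals: int = 1) -> str:
    """Label a per-window frequency series as 'promising', 'damping', or 'neither'.

    Assumed rule (hypothetical, not taken from the paper):
      - 'promising': the last window is higher than the first, and at most
        `max_reversals` window-to-window steps go down.
      - 'damping': the last window is lower than the first, and at most
        `max_reversals` window-to-window steps go up.
    """
    if len(freqs) < 2:
        return "neither"  # a trend needs at least two windows

    steps = list(zip(freqs, freqs[1:]))
    ups = sum(1 for prev, cur in steps if cur > prev)
    downs = sum(1 for prev, cur in steps if cur < prev)

    if freqs[-1] > freqs[0] and downs <= max_reversals:
        return "promising"
    if freqs[-1] < freqs[0] and ups <= max_reversals:
        return "damping"
    return "neither"


if __name__ == "__main__":
    # One small dip (14 -> 13) inside an otherwise rising series.
    print(classify_trend([10, 14, 13, 18, 22]))
    # One small bump (24 -> 25) inside an otherwise falling series.
    print(classify_trend([30, 24, 25, 19, 12]))
    # Too many dips for the assumed tolerance.
    print(classify_trend([10, 12, 9, 13, 8, 15]))
```

This check needs the exact frequency of every item in every window, which is what a memory-limited sketch avoids storing. It illustrates the target patterns only, not how Scout Sketch detects them.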
Source journal
IEEE/ACM Transactions on Networking (Engineering & Technology: Telecommunications)
CiteScore: 8.20
Self-citation rate: 5.40%
Publication volume: 246
Review time: 4-8 weeks
Journal description: The IEEE/ACM Transactions on Networking’s high-level objective is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media: e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.