Scout Sketch+: Finding Both Promising and Damping Items Simultaneously in Data Streams
Authors: Guoju Gao; Tianyu Ma; He Huang; Yu-E Sun; Haibo Wang; Yang Du; Shigang Chen
Journal: IEEE/ACM Transactions on Networking, vol. 32, no. 6, pp. 5491-5506
Publication date: 2024-10-04
DOI: 10.1109/TNET.2024.3469196
URL: https://ieeexplore.ieee.org/document/10705120/
Citation count: 0
Abstract
Data stream processing is valuable in many practical applications. This paper studies two new but important patterns for items in data streams: promising items and damping items. An item is promising if its frequencies over multiple consecutive time windows show an overall upward trend, with a slight decrease allowed in some of those windows. Damping items are the opposite: their frequencies show an overall downward trend. Many applications can benefit from identifying promising or damping items, e.g., monitoring latent attacks in computer networks, pre-adjusting bandwidth allocation in communication channels, detecting potential hot events/news, or finding topics that gradually lose momentum in social networks. We first show how to accurately find promising items in data streams in real time under limited memory. To this end, we propose a novel structure named Scout Sketch, which consists of a Filter and a Finder. The Filter is based on the Bloom filter and eliminates unqualified items with low memory overhead; the Finder records the necessary information about potential items and detects promising items at the end of each time window using tailor-made detection operations. We then enhance Scout Sketch (the result is called Scout Sketch+) to adaptively detect both promising and damping items simultaneously. Finally, we conduct extensive experiments on four real-world datasets, which show that the F1 score and throughput of Scout Sketch(+) are about 2.02 and 5.61 times those of the compared solutions, respectively. All source code is available on GitHub (https://github.com/Aoohhh/ScoutSketch).
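The notion of promising and damping items described in the abstract can be illustrated with a small sketch. This is not the paper's formal definition or the Scout Sketch algorithm: the function name `classify_trend`, the parameter `max_exceptions`, and the rule itself (the series must end higher or lower than it started, with a bounded number of steps against the trend) are assumptions made here for illustration. It also works on exact per-window frequencies, whereas the paper targets limited memory.

```python
from typing import Sequence


def classify_trend(freqs: Sequence[int], max_exceptions: int = 1) -> str:
    """Label a per-window frequency series as 'promising', 'damping', or 'neither'.

    Assumed rule (hypothetical, not taken from the paper): the series must end
    higher (lower) than it started, and at most `max_exceptions` steps between
    consecutive windows may move against that direction.
    """
    if len(freqs) < 2:
        return "neither"

    # Change in frequency between each pair of consecutive windows.
    steps = [later - earlier for earlier, later in zip(freqs, freqs[1:])]
    drops = sum(1 for s in steps if s < 0)
    rises = sum(1 for s in steps if s > 0)

    if freqs[-1] > freqs[0] and drops <= max_exceptions:
        return "promising"
    if freqs[-1] < freqs[0] and rises <= max_exceptions:
        return "damping"
    return "neither"


if __name__ == "__main__":
    print(classify_trend([10, 14, 13, 18, 25]))  # one small dip, overall rise
    print(classify_trend([30, 24, 26, 15, 9]))   # one small rise, overall fall
    print(classify_trend([10, 20, 5, 25, 8]))    # oscillating series
```

Under these assumptions, the first series is labeled promising because only one window dips, the second is labeled damping by symmetry, and the third is rejected because it moves against either trend more than once.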
About the journal:
The IEEE/ACM Transactions on Networking’s high-level objective is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media: e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.