STREAMLINE: Dynamic and Resource-Efficient Auto-Tuning of Stream Processing Data Pipeline Ensembles
Stefan Pedratscher, Zahra Najafabadi Samani, Juan Aznar Poveda, Thomas Fahringer, Marlon Etheredge, Abolfazl Younesi, Juan Jose Durillo Barrionuevo, Peter Thoman
Internet of Things, vol. 34, Article 101731 (Journal Article). Published 2025-09-13.
DOI: 10.1016/j.iot.2025.101731
URL: https://www.sciencedirect.com/science/article/pii/S2542660525002458
Abstract
With the growing volume of data generated by IoT devices and user-driven services, stream processing has become essential for handling continuous, real-time data. However, fluctuating workloads and the dynamic nature of data streams make it difficult to maintain consistent performance over time, requiring adaptive resource allocation and frequent configuration tuning. Running multiple data stream processing pipelines on shared resources further exacerbates the problem by increasing contention, leading to higher end-to-end latency and reduced performance stability. Most existing approaches focus on tuning individual configuration parameters in isolation and overlook interactions between concurrently running data pipelines. To address these limitations, we present STREAMLINE, a dynamic multi-layer auto-tuning framework designed for stream processing environments. STREAMLINE uses transformers to predict future workloads and an evolutionary algorithm to automatically tune configuration parameters. It also includes a resource-efficient scheduler that efficiently assigns operators to resources across a compute cluster. Our dynamic update mechanism minimizes downtime and preserves state during configuration parameter and scheduling changes. We evaluate STREAMLINE on the Grid’5000 testbed using real-time IoT and streaming benchmarks. Results show that STREAMLINE outperforms state-of-the-art methods, improving throughput, end-to-end latency, and CPU utilization by up to 4×, 10×, and 9×, respectively, while reducing costs by up to 10×.
About the journal:
Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross collaboration between researchers, engineers and practitioners in the field of IoT & Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT.
The journal will place a high priority on timely publication, and provide a home for high quality.
Furthermore, IoT is interested in publishing topical Special Issues on any aspect of IoT.