Research of Key Technologies of Distributed Stream Processing Based on FaaS

IF 1.5; Zone 4 (Computer Science); JCR Q3 (Computer Science, Software Engineering)

Concurrency and Computation-Practice & Experience. Pub Date: 2025-09-03. DOI: 10.1002/cpe.70274

Qinlu He, Fan Zhang, Genqing Bian, Weiqi Zhang, Zhen Li

Volume 37, Issues 23-24. Publication type: Journal Article. Full text: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70274

Citations: 0

Abstract

Serverless computing has emerged as a promising paradigm for cloud-based stream processing applications characterized by fluctuating workloads and latency sensitivity. While existing Function-as-a-Service (FaaS) implementations primarily focus on homogeneous CPU/memory resource scaling, they fail to address the challenges of heterogeneous resource management and coordinated elasticity in distributed stream processing. This study proposes HFaaS, a novel serverless framework that integrates dataflow programming with heterogeneous resource orchestration for stream processing applications. The key innovations include: (1) a dataflow-oriented function composition model enabling dynamic scaling of individual processing stages through peer-to-point communication mechanisms, (2) a fine-grained GPU resource allocation strategy that improves utilization by more than 15% through device sharing and elastic scaling capabilities, and (3) a locality-aware scheduling algorithm optimizing task placement based on data proximity and heterogeneous resource availability. Experimental results demonstrate that HFaaS effectively coordinates multi-stage function scaling while maintaining sub-second latency guarantees. The proposed resource allocation strategy improves GPU utilization by 15.2% compared to conventional static allocation approaches, with network overhead reduced by 31.6% through data-local scheduling. This work bridges the gap between serverless architectures and modern stream processing requirements, providing a unified platform for building resource-efficient, latency-sensitive distributed applications in heterogeneous cloud environments.
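
The abstract does not give the scheduling algorithm itself, so the following is only a toy sketch of what locality-aware placement over heterogeneous resources could look like. It is not the HFaaS implementation. The names `Node`, `Task`, and `place`, the fractional GPU share model, and the best-fit tie-break are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float   # free CPU cores (hypothetical unit)
    free_gpu: float   # free GPU share, as a fraction of one device
    datasets: set = field(default_factory=set)  # partitions held locally


@dataclass
class Task:
    name: str
    cpu: float
    gpu: float        # fractional GPU demand, e.g. 0.25 of a device
    dataset: str      # partition the task reads


def place(task, nodes):
    """Reserve resources on one node and return its name, or None."""
    # Keep only nodes with enough free CPU and GPU share for the task.
    feasible = [n for n in nodes
                if n.free_cpu >= task.cpu and n.free_gpu >= task.gpu]
    if not feasible:
        return None
    # Assumed policy: prefer nodes that already hold the input data, then
    # break ties by the smallest leftover GPU share (best fit).
    best = min(feasible,
               key=lambda n: (task.dataset not in n.datasets,
                              n.free_gpu - task.gpu))
    best.free_cpu -= task.cpu
    best.free_gpu -= task.gpu
    return best.name


nodes = [
    Node("a", free_cpu=4, free_gpu=1.0, datasets={"p1"}),
    Node("b", free_cpu=4, free_gpu=0.5, datasets={"p2"}),
    Node("c", free_cpu=8, free_gpu=0.0, datasets={"p2"}),
]
first = place(Task("t1", cpu=1, gpu=0.25, dataset="p2"), nodes)
second = place(Task("t2", cpu=1, gpu=0.5, dataset="p2"), nodes)
third = place(Task("t3", cpu=1, gpu=2.0, dataset="p1"), nodes)
```

In this sketch the first task lands on node `b`, which holds its data and has a spare GPU share. The second task falls back to node `a` because no data-local node has enough GPU share left, and the third task is rejected because no node can satisfy its demand.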

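Source journal

Concurrency and Computation-Practice & Experience (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 5.00
Self-citation rate: 10.00%
Publication volume: 664
Review time: 9.6 months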

About the journal: Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers and authoritative research review papers in the overlapping fields of: Parallel and distributed computing; High-performance computing; Computational and data science; Artificial intelligence and machine learning; Big data applications, algorithms, and systems; Network science; Ontologies and semantics; Security and privacy; Cloud/edge/fog computing; Green computing; and Quantum computing.