{"title":"分布式流处理的弹性脉冲星函数","authors":"G. Russo, Antonio Schiazza, V. Cardellini","doi":"10.1145/3447545.3451901","DOIUrl":null,"url":null,"abstract":"An increasing number of data-driven applications rely on the ability of processing data flows in a timely manner, exploiting for this purpose Data Stream Processing~(DSP) systems. Elasticity is an essential feature for DSP systems, as workload variability calls for automatic scaling of the application processing capacity, to avoid both overload and resource wastage. In this work, we implement auto-scaling in Pulsar Functions, a function-based streaming framework built on top of Apache Pulsar. The latter is is a distributed publish-subscribe messaging platform that natively supports serverless functions. Considering various state-of-the-art policies, we show that the proposed solution is able to scale application parallelism with minimal overhead.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"9 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Elastic Pulsar Functions for Distributed Stream Processing\",\"authors\":\"G. Russo, Antonio Schiazza, V. Cardellini\",\"doi\":\"10.1145/3447545.3451901\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An increasing number of data-driven applications rely on the ability of processing data flows in a timely manner, exploiting for this purpose Data Stream Processing~(DSP) systems. Elasticity is an essential feature for DSP systems, as workload variability calls for automatic scaling of the application processing capacity, to avoid both overload and resource wastage. In this work, we implement auto-scaling in Pulsar Functions, a function-based streaming framework built on top of Apache Pulsar. The latter is is a distributed publish-subscribe messaging platform that natively supports serverless functions. Considering various state-of-the-art policies, we show that the proposed solution is able to scale application parallelism with minimal overhead.\",\"PeriodicalId\":10596,\"journal\":{\"name\":\"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering\",\"volume\":\"9 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3447545.3451901\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3447545.3451901","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Elastic Pulsar Functions for Distributed Stream Processing
An increasing number of data-driven applications rely on the ability of processing data flows in a timely manner, exploiting for this purpose Data Stream Processing~(DSP) systems. Elasticity is an essential feature for DSP systems, as workload variability calls for automatic scaling of the application processing capacity, to avoid both overload and resource wastage. In this work, we implement auto-scaling in Pulsar Functions, a function-based streaming framework built on top of Apache Pulsar. The latter is is a distributed publish-subscribe messaging platform that natively supports serverless functions. Considering various state-of-the-art policies, we show that the proposed solution is able to scale application parallelism with minimal overhead.