SaaMS: The synopses-as-a-microservice paradigm for scalable adaptive streaming analytics across the cloud to edge continuum

IF 3.4 | Zone 2: Computer Science | Q2: COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Systems. Pub Date: 2025-09-23. DOI: 10.1016/j.is.2025.102629

Georgios Panagiotis Kalfakis, Nikos Giatrakos

Information Systems, Volume 136, Article 102629. https://www.sciencedirect.com/science/article/pii/S0306437925001152

Citations: 0

Abstract



SaaMS: The synopses-as-a-microservice paradigm for scalable adaptive streaming analytics across the cloud to edge continuum

The use of data synopses in Big streaming Data analytics can offer 3 types of scalability: (i) horizontal scalability, for scaling with the volume and velocity of Big streaming Data, (ii) vertical scalability, for scaling with the number of processed streams, and (iii) federated scalability, i.e. reducing the communication cost for performing global analytics across a number of geo-distributed data centers or devices in IoT settings. Despite the aforementioned virtues of synopses, no state-of-the-art Big Data framework or IoT platform provides a native API for stream synopses supporting all three types of required scalability. In this work, we fill this gap by introducing a novel system and architectural paradigm, namely Synopses-as-a-MicroService (SaaMS), for both parallel and geo-distributed stream summarization at scale. SaaMS is developed on Apache Kafka and Kafka Streams and can provide all the required types of scalability together with (i) the ability to seamlessly perform adaptive resource allocation with zero downtime for the running analytics and (ii) the ability to run both across powerful computer clusters and Java-enabled IoT devices. Therefore, SaaMS is directly deployable from applications that either operate on powerful clouds or across the cloud to edge continuum.
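The federated scalability mentioned in the abstract depends on synopses that can be built locally at each site and then merged, so that only small summaries travel over the network instead of raw streams. The following is a minimal sketch of that idea, assuming a Count-Min sketch as the synopsis. The abstract does not name specific synopses, and this is not the SaaMS API: the class `CountMinSketch`, its methods, and the `sensor-*` keys are hypothetical names chosen for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency synopsis (illustrative; not part of SaaMS)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One salted hash per row; blake2b gives the same value on every
        # machine, which is required for sketches to be mergeable.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, item)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum across rows
        # is an upper-biased estimate that never undercounts.
        return min(self.rows[r][self._index(r, item)] for r in range(self.depth))

    def merge(self, other):
        """Return a new sketch summarizing both input streams."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketch dimensions must match")
        merged = CountMinSketch(self.width, self.depth)
        for r in range(self.depth):
            merged.rows[r] = [a + b for a, b in zip(self.rows[r], other.rows[r])]
        return merged


# Two hypothetical edge sites summarize their own local streams.
edge_a, edge_b = CountMinSketch(), CountMinSketch()
for key in ["sensor-1"] * 5 + ["sensor-2"] * 2:
    edge_a.add(key)
for key in ["sensor-1"] * 3 + ["sensor-3"] * 4:
    edge_b.add(key)

# Only the fixed-size sketches are shipped and merged for the global view.
global_view = edge_a.merge(edge_b)
print(global_view.estimate("sensor-1"))  # at least 8, the true global count
```

Because the merge is an element-wise sum, the merged sketch is identical to one built over the concatenation of both streams, and its size stays constant no matter how many tuples each site has processed.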


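Source journal

Information Systems (Engineering & Technology; Computer Science: Information Systems)

CiteScore: 9.40

Self-citation rate: 2.70%

Articles published: 112

Review time: 53 days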

Journal description: Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems. Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models and performance enhancements, and show how those innovations contribute to the goals of the application.