FunDa: scalable serverless data analytics and in situ query processing
Authors: Elyes Lounissi, Suvam Kumar Das, Ronnit Peter, Xiaozheng Zhang, Suprio Ray, Lianyin Jia
Journal: Journal of Big Data, 12(1):116
DOI: https://doi.org/10.1186/s40537-025-01141-6
Published online: 2025-05-09
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12064580/pdf/
Citations: 0
Abstract
The pay-what-you-use model of serverless cloud computing (serverless, for short) offers significant benefits to users. This computing paradigm is ideal for short-running, ephemeral tasks; however, it is not well suited to stateful, long-running tasks such as complex data analytics and query processing. We propose FunDa, an on-premises serverless data analytics framework that extends DaskDB, our previously proposed system for unified data analytics and in situ SQL query processing. FunDa is designed to overcome the limitations of existing serverless solutions, which struggle with stateful and long-running data analytics tasks. Our ongoing research focuses on developing a robust architecture for FunDa that enables true serverless computing in on-premises environments while also being able to operate on a public cloud such as AWS Cloud. We have evaluated our system on several benchmarks at different scale factors. Our experimental results in both on-premises and AWS Cloud settings demonstrate FunDa's ability to support automatic scaling and low-latency execution of data analytics workloads, and to give serverless users more flexibility.
About the journal:
The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.