Leveraging the serverless paradigm for realizing machine learning pipelines across the edge-cloud continuum

Efterpi Paraskevoulakou, D. Kyriazis
{"title":"Leveraging the serverless paradigm for realizing machine learning pipelines across the edge-cloud continuum","authors":"Efterpi Paraskevoulakou, D. Kyriazis","doi":"10.1109/ICIN51074.2021.9385525","DOIUrl":null,"url":null,"abstract":"The exceedingly exponential-growing data rate highlighted numerous requirements and several approaches have been released to maximize the added-value of cloud and edge resources. Whereas data scientists utilize algorithmic models in order to transform datasets and extract actionable knowledge, a key challenge is oriented towards abstracting the underline layers: the ones enabling the management of infrastructure resources and the ones responsible to provide frameworks and components as services. In this sense, the serverless approach features as the novel paradigm of new cloud-related technology, enabling the agile implementation of applications and services. The concept of Function as a Service (FaaS) is introduced as a revolutionary model that offers the means to exploit serverless offerings. Developers have the potential to design their applications with the necessary scalability in the form of nanoservices without addressing themselves the way the infrastructure resources should be deployed and managed. By abstracting away the underlying hardware allocations, the data scientist concentrates on the business logic and critical problems of Machine Learning (ML) algorithms. This paper introduces an approach to realize the provision of ML Functions as a Service (i.e., ML-FaaS), by exploiting the Apache OpenWhisk event-driven, distributed serverless platform. The presented approach tackles also composite services that consist of single ones i.e., workflows of ML tasks including processes such as aggregation, cleaning, feature extraction, and analytics; thus, reflecting the complete data path. We also illustrate the operation of the approach mentioned above and assess its performance and effectiveness exploiting a holistic, end-toend anti-fraud detection machine learning pipeline.","PeriodicalId":347933,"journal":{"name":"2021 24th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 24th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICIN51074.2021.9385525","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The exponential growth of data rates has highlighted numerous requirements, and several approaches have been released to maximize the added value of cloud and edge resources. While data scientists utilize algorithmic models to transform datasets and extract actionable knowledge, a key challenge is abstracting the underlying layers: those enabling the management of infrastructure resources and those responsible for providing frameworks and components as services. In this sense, the serverless approach emerges as a novel paradigm of cloud-related technology, enabling the agile implementation of applications and services. The concept of Function as a Service (FaaS) is introduced as a revolutionary model that offers the means to exploit serverless offerings. Developers can design their applications with the necessary scalability in the form of nanoservices without concerning themselves with how the infrastructure resources should be deployed and managed. By abstracting away the underlying hardware allocation, the data scientist can concentrate on the business logic and critical problems of Machine Learning (ML) algorithms. This paper introduces an approach to realize the provision of ML Functions as a Service (i.e., ML-FaaS) by exploiting the Apache OpenWhisk event-driven, distributed serverless platform. The presented approach also tackles composite services that consist of single ones, i.e., workflows of ML tasks including processes such as aggregation, cleaning, feature extraction, and analytics, thus reflecting the complete data path. We also illustrate the operation of the approach and assess its performance and effectiveness using a holistic, end-to-end anti-fraud detection machine learning pipeline.
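
The abstract describes exposing individual ML tasks as serverless functions on Apache OpenWhisk and chaining them into composite workflows. As a rough illustration only (the paper's actual actions, parameter names, and features are not given here), the sketch below shows what one such nanoservice might look like as an OpenWhisk Python action: a hypothetical cleaning and feature-extraction step whose JSON output feeds the next action in a composed pipeline. The input/output keys ("records", "amount", "country") and the derived features are invented for illustration.

```python
# Hypothetical OpenWhisk Python action sketching one step of an ML-FaaS
# pipeline (cleaning + simple feature extraction for fraud detection).
# OpenWhisk invokes main() with a JSON-serializable dict of parameters
# and expects a JSON-serializable dict in return, so each ML task can be
# packaged as an independent nanoservice and chained into a sequence.

def main(params):
    # Assumed input: a list of raw transaction records under "records".
    records = params.get("records", [])

    # Cleaning: discard records with a missing amount (illustrative rule only).
    cleaned = [r for r in records if r.get("amount") is not None]

    # Feature extraction: derive a minimal per-record feature vector.
    features = [
        {
            "amount": float(r["amount"]),
            "is_foreign": int(r.get("country", "local") != "local"),
        }
        for r in cleaned
    ]

    # Whatever is returned becomes the input of the next action in the
    # composed workflow (e.g., an analytics/scoring action).
    return {"features": features, "dropped": len(records) - len(cleaned)}
```

Assuming the function is saved as clean_extract.py, it could be deployed with the standard OpenWhisk CLI (wsk action create clean-extract clean_extract.py) and chained with other actions via an OpenWhisk sequence (wsk action create ml-pipeline --sequence clean-extract,score), which is one straightforward way to realize the kind of composite ML workflow the paper refers to; the paper's own composition mechanism may differ.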