Machine Learning Systems: A Survey from a Data-Oriented Perspective

Impact Factor: 28 | Zone 1 (Computer Science) | JCR Q1 (Computer Science, Theory & Methods)
Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, Neil Lawrence
Journal: ACM Computing Surveys
Publication date: 2025-09-24
Publication type: Journal Article
DOI: 10.1145/3769292
URL: https://doi.org/10.1145/3769292
Citation count: 0

Abstract

Engineers are deploying ML models as parts of real-world systems with the upsurge of AI technologies. Real-world environments challenge the deployment of such systems because these environments produce large amounts of heterogeneous data, and users require increasingly efficient responses. These requirements push prevalent software architectures to the limit when deploying ML-based systems. Data-Oriented Architecture (DOA) is an emerging style that better equips systems to integrate ML models. Even though papers on deployed ML-based systems do not mention DOA, their authors make design decisions that implicitly follow DOA. Implicit decisions create a knowledge gap, limiting practitioners’ ability to implement ML-based systems. This paper surveys why, how, and to what extent practitioners have adopted DOA to implement ML-based systems. We overcome the knowledge gap by answering these questions and explicitly showing the design decisions and practices behind these systems. The survey follows a well-known systematic and semi-automated methodology for reviewing papers in software engineering. The majority of reviewed works partially adopt DOA. Such an adoption enables systems to address big data management, low-latency processing, resource management, security, and privacy requirements. Based on these findings, we formulate practical advice to facilitate the deployment of ML-based systems.
Source journal

ACM Computing Surveys (Engineering & Technology: Computer Science, Theory & Methods)
CiteScore: 33.20
Self-citation rate: 0.60%
Articles published: 372
Review time: 12 months

Journal description: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. CSUR had a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods. It is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.