Apache Wayang: A Unified Data Analytics Framework

IF 0.9 · Zone 4 (Computer Science) · JCR Q4 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Kaustubh Beedkar, Bertty Contreras-Rojas, Haralampos Gavriilidis, Zoi Kaoudi, Volker Markl, Rodrigo Pardo-Meza, Jorge-Arnulfo Quiané-Ruiz
Journal: SIGMOD Record
Published: 2023-10-30
DOI: 10.1145/3631504.3631510 (https://doi.org/10.1145/3631504.3631510)
Citations: 0

Abstract

The large variety of specialized data processing platforms and the increased complexity of data analytics have led to the need for unifying data analytics within a single framework. Such a framework should free users from the burden of (i) choosing the right platform(s) and (ii) gluing code between the different parts of their pipelines. Apache Wayang (Incubating) is the only open-source framework that provides a systematic solution to unified data analytics by integrating multiple heterogeneous data processing platforms. It achieves that by decoupling applications from the underlying platforms and providing an optimizer so that users do not have to specify the platforms on which their pipeline should run. Wayang provides a unified view and processing model, effectively integrating the hodgepodge of heterogeneous platforms into a single framework with increased usability without sacrificing performance and total cost of ownership. In this paper, we present the architecture of Wayang, describe its main components, and give an outlook on future directions.
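To make the idea of decoupling a pipeline from its execution platforms concrete, here is a minimal Python sketch. It is not Wayang's API or its optimizer: the names `Operator`, `Platform`, `choose_platform`, and `run_pipeline`, the two toy platforms, and the linear cost model are all assumptions made for illustration. The pipeline is written once in platform-agnostic terms, and a simple cost-based selector picks a platform for each operator.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Operator:
    """A platform-agnostic logical operator (hypothetical, not Wayang's API)."""
    name: str
    fn: Callable[[list], list]


@dataclass(frozen=True)
class Platform:
    """A toy execution platform with an assumed linear cost model."""
    name: str
    startup_cost: float     # fixed overhead charged per operator
    cost_per_record: float  # marginal cost per input record

    def estimate(self, n_records: int) -> float:
        # Assumed cost model: fixed overhead plus a per-record term.
        return self.startup_cost + self.cost_per_record * n_records

    def execute(self, op: Operator, data: list) -> list:
        # Every toy platform runs the operator in-process; a real platform
        # would translate it into its own execution operators.
        return op.fn(data)


def choose_platform(platforms: List[Platform], n_records: int) -> Platform:
    """Pick the platform with the lowest estimated cost for this input size."""
    return min(platforms, key=lambda p: p.estimate(n_records))


def run_pipeline(
    ops: List[Operator], data: list, platforms: List[Platform]
) -> Tuple[list, List[Tuple[str, str]]]:
    """Run each operator on the cheapest platform; return result and plan."""
    plan = []
    for op in ops:
        platform = choose_platform(platforms, len(data))
        plan.append((op.name, platform.name))
        data = platform.execute(op, data)
    return data, plan


# The user describes the pipeline once and never names a platform.
pipeline = [
    Operator("filter_even", lambda xs: [x for x in xs if x % 2 == 0]),
    Operator("square", lambda xs: [x * x for x in xs]),
]
toy_platforms = [
    Platform("local", startup_cost=1.0, cost_per_record=0.01),
    Platform("cluster", startup_cost=50.0, cost_per_record=0.001),
]

result, plan = run_pipeline(pipeline, list(range(10_000)), toy_platforms)
print(plan)
```

With these assumed costs, the filter over 10,000 records is assigned to the toy cluster, while the map over the remaining 5,000 records is assigned to the local platform, so a single pipeline ends up spanning two platforms without the user specifying either.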
Source journal
SIGMOD Record
Category: Engineering and Technology, Computer Science: Software Engineering
CiteScore: 3.10
Self-citation rate: 9.10%
Articles published: 41
Review time: >12 weeks
Journal description: SIGMOD investigates the development and application of database technology to support the full range of data management needs. The scope of interests and membership is wide, with an almost equal mix of people from industry and academia. SIGMOD sponsors an annual conference that is regarded as one of the most important in the field, particularly for practitioners. Areas of special interest: active and temporal data management, data mining and models, database programming languages, databases on the WWW, distributed data management, engineering, federated multi-database and mobile management, query processing and optimization, rapid application development tools, spatial data management, user interfaces.