Smart Data Placement for Big Data Pipelines: An Approach based on the Storage-as-a-Service Model

A. Khan, Nikolay Nikolov, M. Matskin, R.-C. Prodan, Hui Song, D. Roman, A. Soylu
Published in: 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC)
Publication date: 2022-12-01
DOI: 10.1109/UCC56403.2022.00056 (https://doi.org/10.1109/UCC56403.2022.00056)

Abstract

The development of big data pipelines is a challenging task, especially when data storage is considered part of the pipeline. Local storage is expensive, hard to maintain, and comes with several challenges (e.g., data availability, data security, and backup). The use of cloud storage, i.e., Storage-as-a-Service (StaaS), instead of local storage has the potential to provide more flexibility in terms of scalability, fault tolerance, and availability. In this paper, we propose a generic approach to integrating StaaS with data pipelines, in which computation runs on an on-premise server or on a specific cloud while storage is provided through StaaS, and we develop a ranking method for available storage options based on five key parameters: cost, proximity, network performance, the impact of server-side encryption, and user weights. The evaluation demonstrates the effectiveness of the proposed approach in terms of data transfer performance, and the feasibility of dynamically selecting a storage option based on four primary user scenarios.
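The abstract does not give the ranking formula, so the following is only a minimal sketch of how a weighted ranking over the named parameters could look. It assumes a plain weighted sum of min-max normalised criteria, which may differ from the method in the paper. The `StorageOption` fields, their units, the provider names, the example figures, and the two weight profiles are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """One candidate StaaS offering (all fields and units are assumed)."""
    name: str
    cost_per_gb: float          # e.g. USD per GB-month
    distance_km: float          # proximity of the storage region to the compute site
    throughput_mbps: float      # measured network performance
    encryption_overhead: float  # fractional slowdown from server-side encryption


def normalise(values, higher_is_better):
    """Min-max scale values to [0, 1] so that 1 is always the best score."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All options are equal on this criterion, so none is penalised.
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_options(options, weights):
    """Return (name, score) pairs sorted from best to worst."""
    criteria = {
        "cost": ([o.cost_per_gb for o in options], False),
        "proximity": ([o.distance_km for o in options], False),
        "network": ([o.throughput_mbps for o in options], True),
        "encryption": ([o.encryption_overhead for o in options], False),
    }
    # Rescale the user weights so that they sum to 1.
    total_weight = sum(weights[c] for c in criteria)
    scores = [0.0] * len(options)
    for criterion, (values, higher_is_better) in criteria.items():
        share = weights[criterion] / total_weight
        for i, s in enumerate(normalise(values, higher_is_better)):
            scores[i] += share * s
    names = [o.name for o in options]
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates and user weight profiles.
options = [
    StorageOption("provider-a", 0.023, 50.0, 800.0, 0.05),
    StorageOption("provider-b", 0.010, 1200.0, 300.0, 0.02),
    StorageOption("provider-c", 0.018, 400.0, 600.0, 0.10),
]
cost_focused = {"cost": 0.7, "proximity": 0.1, "network": 0.1, "encryption": 0.1}
performance_focused = {"cost": 0.1, "proximity": 0.3, "network": 0.5, "encryption": 0.1}

for label, weights in [("cost", cost_focused), ("performance", performance_focused)]:
    best_name, best_score = rank_options(options, weights)[0]
    print(f"{label}-focused choice: {best_name} ({best_score:.3f})")
```

In this sketch the user weights are the only scenario-specific input, so changing the weight profile is enough to change which option ranks first. Normalising each criterion before weighting keeps parameters with different units (price, distance, throughput) comparable.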