Novel Abstraction and Offload Mechanisms for High Performance Cloud-native Distributed Object Stores

A. Reddy, D. Moghe, Manik Taneja, Roger Liao, Subin Francis
Published in: 2022 IEEE International Conference on Cloud Engineering (IC2E)
Publication date: 2022-09-01
DOI: 10.1109/IC2E55432.2022.00024 (https://doi.org/10.1109/IC2E55432.2022.00024)
Citations: 0

Abstract

Object storage solutions are typically optimized for capacity and cost, while performance has traditionally been an afterthought. We make the case for a highly performant distributed object store by a) building an abstraction layer that passes immutability and data affinity hints to the underlying storage, and b) making the Objects layer aware of the hardware configuration, which enables the object storage controllers to maximize throughput. With these optimizations we show that object storage performance can approach 95% of the maximum possible performance of the underlying raw storage, while the abstractions remain generic enough to run on any general-purpose, off-the-shelf storage system. We have observed these performance gains in more than 1000 customer environments on diverse hardware. We also extend these optimizations with generic mechanisms that offload compute closer to storage, which significantly benefits a broad class of workloads. Specifically, we evaluate the performance gains from a) well-known constructs such as S3 Select for analytics workloads and b) generic compute offload such as Objects Lambda. This ability to offload compute is critical for modern distributed workloads, such as AI/ML and analytics processing, that operate on very large distributed data sets.
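The idea of offloading compute closer to storage can be illustrated with a small, self-contained sketch. It is not the authors' implementation and does not use the real S3 Select API: the in-memory `STORE`, the `get_object` and `select_object` helpers, and the sample CSV are all hypothetical, chosen only to show why filtering next to the data reduces the bytes a client has to receive.

```python
import csv
import io

# Hypothetical in-memory "object store" (key -> CSV bytes); an assumption for
# illustration, not the system described in the abstract.
STORE = {
    "sales.csv": b"region,amount\neu,10\nus,25\neu,7\nus,3\n",
}


def get_object(key: str) -> bytes:
    """Plain GET: the whole object is returned to the client."""
    return STORE[key]


def select_object(key: str, column: str, value: str) -> bytes:
    """Offloaded filter: runs beside the data and returns matching rows only."""
    reader = csv.DictReader(io.StringIO(STORE[key].decode()))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column] == value:
            writer.writerow(row)
    return out.getvalue().encode()


def client_side_total(key: str, column: str, value: str) -> tuple[int, int]:
    """Fetch everything, filter on the client. Returns (total, bytes received)."""
    body = get_object(key)
    rows = csv.DictReader(io.StringIO(body.decode()))
    total = sum(int(r["amount"]) for r in rows if r[column] == value)
    return total, len(body)


def offloaded_total(key: str, column: str, value: str) -> tuple[int, int]:
    """Filter at the store, sum on the client. Returns (total, bytes received)."""
    body = select_object(key, column, value)
    rows = csv.DictReader(io.StringIO(body.decode()))
    total = sum(int(r["amount"]) for r in rows)
    return total, len(body)


if __name__ == "__main__":
    print("client-side:", client_side_total("sales.csv", "region", "eu"))
    print("offloaded:  ", offloaded_total("sales.csv", "region", "eu"))
```

Both paths compute the same total, but the offloaded path receives only the matching rows. On a real deployment the saving depends on how selective the predicate is and on the cost of running the filter at the storage tier.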