Novel Abstraction and Offload Mechanisms for High Performance Cloud-native Distributed Object Stores

2022 IEEE International Conference on Cloud Engineering (IC2E), Pub Date: 2022-09-01, DOI: 10.1109/IC2E55432.2022.00024

A. Reddy, D. Moghe, Manik Taneja, Roger Liao, Subin Francis

Citations: 0

Abstract

Object storage solutions are typically optimised for capacity and cost, while performance has traditionally been an afterthought. We make the case for a highly performant distributed object store by a) building an abstraction layer that passes immutability and data affinity hints to the underlying storage, and b) making the Objects layer aware of the hardware configuration, enabling the object storage controllers to maximise throughput. With these optimisations we show that object storage performance can approach 95% of the maximum possible performance of the underlying raw storage, while keeping the abstractions generic enough to run on any general-purpose off-the-shelf storage system. We have observed these performance gains in more than 1000 customer environments across diverse hardware. We also extend these optimisations with generic mechanisms to offload compute closer to storage, which has significant benefits for a broad class of workloads. Specifically, we evaluate performance gains from a) well-known constructs such as S3 Select for analytics workloads and b) generic compute offload such as Objects Lambda. This ability to offload compute is critical for modern distributed workloads, such as AI/ML and analytics processing, that operate on very large distributed data sets.
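
To make the compute-offload idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation and does not use any real S3 or Objects API: the in-memory `OBJECTS` dictionary and the functions `get_object`, `select_object`, and `client_side_filter` are hypothetical names introduced only for illustration. It compares fetching a whole CSV object and filtering on the client against filtering next to the data, in the spirit of S3 Select, so that only matching rows cross the network.

```python
import csv
import io

# Hypothetical in-memory "object store": key -> CSV bytes. Not a real API.
OBJECTS = {
    "sensors.csv": b"id,temp\n1,20\n2,35\n3,41\n4,18\n",
}


def get_object(key):
    """Baseline path: return the whole object; the client filters it."""
    return OBJECTS[key]


def select_object(key, column, threshold):
    """Offload path: filter next to the data, return only matching rows."""
    reader = csv.DictReader(io.StringIO(OBJECTS[key].decode()))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if int(row[column]) > threshold:
            writer.writerow(row)
    return out.getvalue().encode()


def client_side_filter(data, column, threshold):
    """Filter on the client after the full object has been transferred."""
    reader = csv.DictReader(io.StringIO(data.decode()))
    return [row for row in reader if int(row[column]) > threshold]


full = get_object("sensors.csv")                   # whole object transferred
pushed = select_object("sensors.csv", "temp", 30)  # filtered result only

rows_client = client_side_filter(full, "temp", 30)
rows_offload = list(csv.DictReader(io.StringIO(pushed.decode())))

# Both paths yield the same rows; the offload path moves fewer bytes.
print(rows_client == rows_offload, len(full), len(pushed))
```

In this toy setting the saving is only a few bytes, but the same pattern is why pushing selective filters toward storage can matter for very large data sets: the amount of data transferred scales with the result rather than with the object size.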
