Purity: Building Fast, Highly-Available Enterprise Flash Storage from Commodity Components

John Colgrove, John D. Davis, John Hayes, E. L. Miller, C. Sandvig, R. Sears, Ariel Tamches, Neil Vachharajani, Feng Wang
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. Published 2015-05-27.
DOI: 10.1145/2723372.2742798 (https://doi.org/10.1145/2723372.2742798)
Citations: 62

Abstract

Although flash storage has largely replaced hard disks in consumer-class devices, enterprise workloads pose unique challenges that have slowed adoption of flash in "performance tier" storage appliances. In this paper, we describe Purity, the foundation of Pure Storage's Flash Arrays, the first all-flash enterprise storage system to support compression, deduplication, and high availability. Purity borrows techniques from modern database and key-value storage architectures, and introduces novel storage primitives that have wide applicability to data management systems. For instance, all writes in Purity are monotonic, and deletions are handled using an atomic predicate-based tuple elision primitive. Purity's redundancy mechanisms are optimized for SSD failure modes and performance characteristics, allowing for fast recovery from component failures and lower space overhead than the best hard disk systems. We built deduplication and data compression schemes atop these primitives. Flash changes storage capacity/performance tradeoffs: unlike disk-based systems, flash deployments are rarely performance bound. A single Purity appliance can provide over 7 GiB/s of throughput on 32 KiB random I/Os, even through multiple device failures, and while providing asynchronous off-site replication. Typical installations have 99.9% latencies under 1 ms, and production arrays average 5.4x data reduction and 99.999% availability. Purity takes advantage of storage performance increasing more rapidly than computational performance to build a simpler (with respect to engineering, installation, and management) scale-up storage appliance that supports hundreds of terabytes of highly available, high-performance storage. The resulting performance and capacity support many customer deployments of multiple applications, including scale-out and parallel systems, such as MongoDB and Oracle RAC, on a single Purity appliance.
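
To put the headline figures in the abstract into more familiar units, the following is a minimal back-of-the-envelope sketch. It only performs unit arithmetic on the quoted numbers (7 GiB/s at 32 KiB per I/O, 99.999% availability, 5.4x data reduction). The function names and the 100 TiB logical data set are illustrative assumptions, not values or interfaces from the paper.

```python
# Back-of-the-envelope arithmetic on figures quoted in the abstract.
# All helper names are hypothetical; only the input figures come from the text.

KIB = 1024
GIB = 1024 ** 3


def iops(throughput_bytes_per_s: float, io_size_bytes: float) -> float:
    """I/O operations per second implied by a throughput and a fixed I/O size."""
    return throughput_bytes_per_s / io_size_bytes


def downtime_minutes_per_year(availability: float, days_per_year: float = 365.0) -> float:
    """Expected unavailable minutes per year for a given availability fraction."""
    return (1.0 - availability) * days_per_year * 24 * 60


def physical_tib(logical_tib: float, reduction_ratio: float) -> float:
    """Physical capacity consumed by logical data at a given data reduction ratio."""
    return logical_tib / reduction_ratio


# 7 GiB/s of 32 KiB random I/Os.
random_io_rate = iops(7 * GIB, 32 * KIB)

# 99.999% ("five nines") availability.
downtime = downtime_minutes_per_year(0.99999)

# Assumed example: 100 TiB of logical data at the reported 5.4x average reduction.
physical = physical_tib(100, 5.4)

print(f"{random_io_rate:,.0f} IOPS")
print(f"{downtime:.2f} minutes of downtime per year")
print(f"{physical:.1f} TiB physical for 100 TiB logical")
```

Read this way, the throughput claim corresponds to roughly 229 thousand 32 KiB operations per second, and five-nines availability to a little over five minutes of downtime per year.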