SDF: software-defined flash for web-scale internet storage systems

Ouyang Jian, Shiding Lin, Song Jiang, Zhenyu Hou, Yong Wang, Yuanzheng Wang
Published in: Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (publication date: 2014-02-24)
DOI: 10.1145/2541940.2541959 (https://doi.org/10.1145/2541940.2541959)
Citations: 219

Abstract

In the last several years, hundreds of thousands of SSDs have been deployed in the data centers of Baidu, China's largest Internet search company. Currently only 40% or less of the raw bandwidth of the flash memory in the SSDs is delivered by the storage system to the applications. Moreover, because of space over-provisioning in the SSD to accommodate non-sequential or random writes, and additionally, parity coding across flash channels, typically only 50-70% of the raw capacity of a commodity SSD can be used for user data. Given the large scale of Baidu's data center, making the most effective use of its SSDs is of great importance. Specifically, we seek to maximize both bandwidth and usable capacity. To achieve this goal we propose software-defined flash (SDF), a hardware/software co-designed storage system to maximally exploit the performance characteristics of flash memory in the context of our workloads. SDF exposes individual flash channels to the host software and eliminates space over-provisioning. The host software, given direct access to the raw flash channels of the SSD, can effectively organize its data and schedule its data access to better realize the SSD's raw performance potential. Currently more than 3000 SDFs have been deployed in Baidu's storage system that supports its web page and image repository services. Our measurements show that SDF can deliver approximately 95% of the raw flash bandwidth and provide 99% of the flash capacity for user data. SDF increases I/O bandwidth by 300% and reduces per-GB hardware cost by 50% on average compared with the commodity-SSD-based system used at Baidu.
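
To make the percentages in the abstract concrete, the following minimal Python sketch applies them to a single device. The efficiency fractions (40% bandwidth and 50-70% capacity for a commodity SSD, about 95% and 99% for SDF) come from the abstract; the raw capacity and raw bandwidth values, the `FlashEfficiency` class, and all other names are hypothetical and chosen only for illustration. Because the abstract states "40% or less", the commodity bandwidth figure is an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashEfficiency:
    """Fraction of raw flash resources that reaches the application."""
    bandwidth_fraction: float  # delivered bandwidth / raw bandwidth
    capacity_fraction: float   # usable capacity / raw capacity

    def delivered_bandwidth(self, raw_mb_per_s: float) -> float:
        return raw_mb_per_s * self.bandwidth_fraction

    def usable_capacity(self, raw_gb: float) -> float:
        return raw_gb * self.capacity_fraction


# Fractions quoted in the abstract.
COMMODITY_LOW = FlashEfficiency(bandwidth_fraction=0.40, capacity_fraction=0.50)
COMMODITY_HIGH = FlashEfficiency(bandwidth_fraction=0.40, capacity_fraction=0.70)
SDF = FlashEfficiency(bandwidth_fraction=0.95, capacity_fraction=0.99)

# Hypothetical device parameters (assumptions, not taken from the paper).
RAW_GB = 1000.0
RAW_MB_PER_S = 1000.0


def summarize() -> dict:
    """Return delivered bandwidth (MB/s) and usable capacity (GB) per setup."""
    setups = {
        "commodity (low)": COMMODITY_LOW,
        "commodity (high)": COMMODITY_HIGH,
        "SDF": SDF,
    }
    return {
        name: (eff.delivered_bandwidth(RAW_MB_PER_S), eff.usable_capacity(RAW_GB))
        for name, eff in setups.items()
    }


if __name__ == "__main__":
    for name, (mb_per_s, gb) in summarize().items():
        print(f"{name:>16}: {mb_per_s:7.1f} MB/s delivered, {gb:7.1f} GB usable")
```

These figures describe per-device efficiency only. The 300% bandwidth increase and 50% per-GB cost reduction reported in the abstract are system-level measurements and are not derived by this sketch.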