Offline and Online Algorithms for SSD Management

Proceedings of the ACM on Measurement and Analysis of Computing Systems. Pub Date: 2021-12-14. DOI: 10.1145/3491045

Tomer Lange, J. Naor, G. Yadgar

Citations: 4

Abstract

Flash-based solid state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: after data has been written to a physical location, that location has to be erased before new data can be written to it. Moreover, SSDs support read and write operations at the granularity of pages, while erasures are performed on entire blocks, which often contain hundreds of pages. When erasing a block, any valid data it stores must be rewritten to a clean location. As an SSD eventually wears out as the number of erasures grows, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit improved performance for a wide range of input traces.
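The page/block constraint described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes the common reading of "greedy" in the SSD literature (when no clean block is left, erase the block holding the fewest valid pages), which the abstract itself does not spell out. The class name `GreedySSD`, its fields, and the simplification that surviving pages are written back into the block that was just erased are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class GreedySSD:
    """Toy model of an SSD: writes go to pages, erasures clear whole blocks.

    Assumption (not from the abstract): the greedy policy erases the block
    with the fewest valid pages once every block has been filled.
    """

    def __init__(self, num_blocks: int, pages_per_block: int) -> None:
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        # blocks[b][s] is the logical page stored in slot s, or None if the
        # slot is clean or holds stale (invalidated) data.
        self.blocks: List[List[Optional[int]]] = [
            [None] * pages_per_block for _ in range(num_blocks)
        ]
        # used[b] counts slots written in block b since its last erasure.
        self.used: List[int] = [0] * num_blocks
        self.location: Dict[int, Tuple[int, int]] = {}
        self.open_block = 0
        self.rewrites = 0   # valid pages copied because their block was erased
        self.erasures = 0

    def _valid_count(self, b: int) -> int:
        return sum(1 for page in self.blocks[b] if page is not None)

    def _append(self, page: int) -> None:
        # No update-in-place: always write to the next unused slot.
        b = self.open_block
        s = self.used[b]
        self.blocks[b][s] = page
        self.used[b] += 1
        self.location[page] = (b, s)

    def _ensure_space(self) -> None:
        if self.used[self.open_block] < self.pages_per_block:
            return
        # Prefer a clean block if one exists.
        for b in range(self.num_blocks):
            if self.used[b] == 0:
                self.open_block = b
                return
        # Every block is full: pick the greedy victim.
        victim = min(range(self.num_blocks), key=self._valid_count)
        survivors = [p for p in self.blocks[victim] if p is not None]
        if len(survivors) == self.pages_per_block:
            raise RuntimeError("no reclaimable space: all pages are valid")
        self.blocks[victim] = [None] * self.pages_per_block
        self.used[victim] = 0
        self.erasures += 1
        self.open_block = victim
        for p in survivors:          # valid data must be rewritten
            self._append(p)
            self.rewrites += 1

    def write(self, page: int) -> None:
        old = self.location.get(page)
        if old is not None:
            b, s = old
            self.blocks[b][s] = None  # the previous copy becomes stale
        self._ensure_space()
        self._append(page)
```

Feeding a request sequence through `write` and then reading `rewrites` and `erasures` gives the cost that a management algorithm tries to keep small. The model requires fewer logical pages than physical slots, since otherwise no block ever contains a stale page to reclaim.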


Source journal

Proceedings of the ACM on Measurement and Analysis of Computing Systems

CiteScore

3.20

Self-citation rate

0.00%

Articles published