TDDFS

ACM Transactions on Storage (TOS) Pub Date : 2019-02-05 DOI:10.1145/3295461

Zhichao Cao, Hao Wen, Xiongzi Ge, Jingwei Ma, Jim Diehl, D. Du


Citations: 2

Abstract

With the rapid increase in the amount of data produced and the development of new types of storage devices, storage tiering continues to be a popular way to achieve a good tradeoff between performance and cost-effectiveness. In a basic two-tier storage system, a storage tier with higher performance and typically higher cost (the fast tier) is used to store frequently-accessed (active) data while a large amount of less-active data are stored in the lower-performance and low-cost tier (the slow tier). Data are migrated between these two tiers according to their activity. In this article, we propose a Tier-aware Data Deduplication-based File System, called TDDFS, which can operate efficiently on top of a two-tier storage environment. Specifically, to achieve better performance, nearly all file operations are performed in the fast tier. To achieve higher cost-effectiveness, files are migrated from the fast tier to the slow tier if they are no longer active, and this migration is done with data deduplication. The distinctiveness of our design is that it maintains the non-redundant (unique) chunks produced by data deduplication in both tiers if possible. When a file is reloaded (called a reloaded file) from the slow tier to the fast tier, if some data chunks of the file already exist in the fast tier, then the data migration of these chunks from the slow tier can be avoided. Our evaluation shows that TDDFS achieves close to the best overall performance among various file-tiering designs for two-tier storage systems.
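The migrate-with-deduplication and reload behavior described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the fixed-size chunking, the SHA-256 fingerprints, the `TwoTierStore` class, and the `keep_in_fast` flag (standing in for "if possible") are all assumptions made for illustration, and Python dictionaries stand in for the two storage tiers.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny chunk size chosen only for illustration


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest (an assumed fingerprint scheme)."""
    return hashlib.sha256(chunk).hexdigest()


class TwoTierStore:
    """Toy model of a fast tier and a slow tier with deduplicated migration."""

    def __init__(self):
        self.fast_files = {}   # name -> bytes: active files kept whole in the fast tier
        self.fast_chunks = {}  # fingerprint -> bytes: unique chunks retained in the fast tier
        self.slow_chunks = {}  # fingerprint -> bytes: unique chunks in the slow tier
        self.recipes = {}      # name -> list of fingerprints for migrated files

    def write(self, name: str, data: bytes) -> None:
        # File operations happen in the fast tier.
        self.fast_files[name] = data

    def migrate_to_slow(self, name: str, keep_in_fast: bool = True) -> None:
        # Split an inactive file into chunks and store each unique chunk once.
        data = self.fast_files.pop(name)
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = fingerprint(chunk)
            self.slow_chunks.setdefault(fp, chunk)      # deduplicate in the slow tier
            if keep_in_fast:
                self.fast_chunks.setdefault(fp, chunk)  # also retain in the fast tier
            recipe.append(fp)
        self.recipes[name] = recipe

    def reload(self, name: str):
        # Rebuild the file, copying only chunks that the fast tier lacks.
        transferred = 0
        parts = []
        for fp in self.recipes[name]:
            if fp not in self.fast_chunks:
                self.fast_chunks[fp] = self.slow_chunks[fp]
                transferred += 1
            parts.append(self.fast_chunks[fp])
        return b"".join(parts), transferred
```

In this sketch, `reload` returns the reconstructed file contents together with the number of chunks that actually had to be copied from the slow tier. When two migrated files share chunks and those chunks were retained in the fast tier, reloading the second file copies only the chunks unique to it.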




Source journal

ACM Transactions on Storage (TOS)

Self-citation rate

0.00%

Number of publications