{"title":"Red: An efficient replacement algorithm based on REsident Distance for exclusive storage caches","authors":"Yingjie Zhao, Nong Xiao, Fang Liu","doi":"10.1109/MSST.2010.5496988","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496988","url":null,"abstract":"This paper presents our replacement algorithm named RED for storage caches. RED is exclusive. It can eliminate the duplications between a storage cache and its client cache. RED is high performance. A new criterion Resident Distance is proposed for making an efficient replacement decision instead of Recency and Frequency. Moreover, RED is non-intrusive to a storage client. It does not need to change client software and could be used in a real-life system. Previous work on the management of a storage cache can attain one or two of the above benefits, but not all of them. We have evaluated the performance of RED by using simulations with both synthetic and real-life traces. The simulation results show that RED significantly outperforms LRU, ARC, MQ, and is better than DEMOTE, PROMOTE for a wide range of cache sizes.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124922828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BPAC: An adaptive write buffer management scheme for flash-based Solid State Drives","authors":"Guanying Wu, B. Eckart, Xubin He","doi":"10.1109/MSST.2010.5496998","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496998","url":null,"abstract":"Solid State Drives (SSDs) have shown promise to be a candidate to replace traditional hard disk drives, but due to certain physical characteristics of NAND flash, there are some challenging areas of improvement and further research. We focus on the layout and management of the small amount of RAM that serves as a cache between the SSD and the system that uses it. Of the techniques that have previously been proposed to manage this cache, we identify several sources of inefficient cache space management due to the way pages are clustered in blocks and the limited replacement policy. We develop a hybrid page/block architecture along with an advanced replacement policy, called BPAC, or Block-Page Adaptive Cache, to exploit both temporal and spatial locality. Our technique involves adaptively partitioning the SSD on-disk cache to separately hold pages with high temporal locality in a page list and clusters of pages with low temporal but high spatial locality in a block list. We run trace-driven simulations to verify our design and find that it outperforms other popular flash-aware cache schemes under different workloads.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126132931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MAD2: A scalable high-throughput exact deduplication approach for network backup services","authors":"Jiansheng Wei, Hong Jiang, Ke Zhou, D. Feng","doi":"10.1109/MSST.2010.5496987","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496987","url":null,"abstract":"Deduplication has been widely used in disk-based secondary storage systems to improve space efficiency. However, there are two challenges facing scalable high-throughput deduplication storage. The first is the duplicate-lookup disk bottleneck due to the large size of data index that usually exceeds the available RAM space, which limits the deduplication throughput. The second is the storage node island effect resulting from duplicate data among multiple storage nodes that are difficult to eliminate. Existing approaches fail to completely eliminate the duplicates while simultaneously addressing the challenges. This paper proposes MAD2, a scalable high-throughput exact deduplication approach for network backup services. MAD2 eliminates duplicate data both at the file level and at the chunk level by employing four techniques to accelerate the deduplication process and evenly distribute data. First, MAD2 organizes fingerprints into a Hash Bucket Matrix (HBM), whose rows can be used to preserve the data locality in backups. Second, MAD2 uses Bloom Filter Array (BFA) as a quick index to quickly identify non-duplicate incoming data objects or indicate where to find a possible duplicate. Third, Dual Cache is integrated in MAD2 to effectively capture and exploit data locality. Finally, MAD2 employs a DHT-based Load-Balance technique to evenly distribute data objects among multiple storage nodes in their backup sequences to further enhance performance with a well-balanced load. We evaluate our MAD2 approach on the backend storage of B-Cloud, a research-oriented distributed system that provides network backup services. Experimental results show that MAD2 significantly outperforms the state-of-the-art approximate deduplication approaches in terms of deduplication efficiency, supporting a deduplication throughput of at least 100MB/s for each storage component.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122485808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Delayed partial parity scheme for reliable and high-performance flash memory SSD","authors":"Soojun Im, Dongkun Shin","doi":"10.1109/MSST.2010.5496997","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496997","url":null,"abstract":"The I/O performance of flash memory solid-state disks (SSDs) is increasing by exploiting parallel I/O architectures. However, the reliability problem is a critical issue in building large-scale flash storage. We propose a novel Redundant Arrays of Inexpensive Disks (RAID) architecture which uses the delayed parity update and partial parity caching techniques for reliable and high-performance flash memory SSDs. The proposed techniques improve the performance of the RAID-5 SSD by 38% and 30% on average in comparison to the original RAID-5 technique and the previous delayed parity update technique, respectively.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127961390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design issues for a shingled write disk system","authors":"A. Amer, D. Long, E. L. Miller, Jehan-Francois Pâris, T. Schwarz","doi":"10.1109/MSST.2010.5496991","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496991","url":null,"abstract":"If the data density of magnetic disks is to continue its current 30–50% annual growth, new recording techniques are required. Among the actively considered options, shingled writing is currently the most attractive one because it is the easiest to implement at the device level. Shingled write recording trades the inconvenience of the inability to update in-place for a much higher data density by using a different write technique that overlaps the currently written track with the previous track. Random reads are still possible on such devices, but writes must be done largely sequentially. In this paper, we discuss possible changes to disk-based data structures that the adoption of shingled writing will require. We first explore disk structures that are optimized for large sequential writes with little or no sequential writing, even of metadata structures, while providing acceptable read performance. We also examine the usefulness of non-volatile RAM and the benefits of object-based interfaces in the context of shingled disks. Finally, through the analysis of recent device traces, we demonstrate the surprising stability of written device blocks, with general purpose workloads showing that more than 93% of device blocks remain unchanged over a day, and that for more specialized workloads less than 0.5% of a shingled-write disk's capacity would be needed to hold randomly updated blocks.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122812542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy and thermal aware buffer cache replacement algorithm","authors":"Jianhui Yue, Yifeng Zhu, Zhao Cai, Lin Lin","doi":"10.1109/MSST.2010.5496982","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496982","url":null,"abstract":"Power consumption is an increasingly pressing concern for data servers as it directly affects running costs and system reliability. Prior studies have shown most memory space on data servers is used for buffer caching and thus cache replacement becomes critical. Temporally concentrating memory accesses to a smaller set of memory chips increases the chances of free riding through DMA overlapping and also enlarges the opportunities for other ranks to power down. This paper proposes a power and thermal-aware buffer cache replacement algorithm. It conjectures that the memory rank that holds the largest number of cold blocks is very likely to be accessed in the near future. Choosing the victim block from this rank can help reduce the number of memory ranks that are active simultaneously. We use three real-world I/O server traces, including TPC-C, LM-TBF and MSN-BEFS to evaluate our algorithm. Experimental results show that our algorithm can save up to 27% more energy than LRU and reduce the temperature of memory by up to 5.45°C with little or no performance degradation.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115440079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A performance model and file system space allocation scheme for SSDs","authors":"Choulseung Hyun, Jongmoo Choi, Y. Oh, Donghee Lee, Eunsam Kim, S. Noh","doi":"10.1109/MSST.2010.5496986","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496986","url":null,"abstract":"Solid State Drives (SSDs) are now becoming a part of mainstream computers. Even though disk scheduling algorithms and file systems of today have been optimized to exploit the characteristics of hard drives, relatively little attention has been paid to modeling and exploiting the characteristics of SSDs. In this paper, we consider the use of SSDs from the file system standpoint. To do so, we derive a performance model for the SSDs. Based on this model, we devise a file system space allocation scheme, which we call Greedy-Space, for block or hybrid mapping SSDs. From the Postmark benchmark results, we observe substantial performance improvements when employing the Greedy-Space scheme in ext3 and Reiser file systems running on three SSDs available in the market.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125061261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Observations made while running a multi-petabyte storage system","authors":"M. Santos, Dennis Waldron","doi":"10.1109/MSST.2010.5496984","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496984","url":null,"abstract":"We take an overview of the CERN Advanced Storage (CASTOR) version 2 system and its usage at CERN while serving the High Energy Physics community. We further explore some of the observations made between 2005 and 2010 while managing this multi-petabyte distributed storage system.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126815942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Block storage listener for detecting file-level intrusions","authors":"M. Allalouf, Muli Ben-Yehuda, J. Satran, Itai Segall","doi":"10.1109/MSST.2010.5496974","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496974","url":null,"abstract":"An intrusion detection system (IDS) is usually located and operated at the host, where it captures local suspicious events, or at an appliance that listens to the network activity. Providing an online IDS to the storage controller is essential for dealing with compromised hosts or coordinated attacks by multiple hosts. SAN block storage controllers are connected to the world via block-level protocols, such as iSCSI and Fibre Channel. Usually, block-level storage systems do not maintain information specific to the file-system using them. The range of threats that can be handled at the block level is limited. A file system view at the controller, together with the knowledge of which arriving block belongs to which file or inode, will enable the detection of file-level threats. In this paper, we present IDStor, an IDS for block-based storage. IDStor acts as a listener to storage traffic, out of the controller's I/O path, and is therefore attractive for integration into existing SAN-based storage solutions. IDStor maintains a block-to-file mapping that is updated online. Using this mapping, IDStor infers the semantics of file-level commands from the intercepted block-level operations, thereby detecting file-level intrusions by merely observing the block read and write commands passing between the hosts and the controller.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124005766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deferred updates for flash-based storage","authors":"Biplob K. Debnath, M. Mokbel, D. Lilja, D. Du","doi":"10.1109/MSST.2010.5496994","DOIUrl":"https://doi.org/10.1109/MSST.2010.5496994","url":null,"abstract":"NAND flash memory based storage has faster reads, higher power savings, and lower cooling cost compared to the conventional rotating magnetic disk drive. However, in the case of flash memory, read and write operations are not symmetric. Write operations are much slower than read operations. Moreover, frequent update operations reduce the lifetime of the flash memory. Due to the faster read performance, flash-based storage is particularly attractive for read-intensive database workloads, while it can produce poor performance when used for update-intensive database workloads. This paper aims to improve write performance and lifetime of flash-based storage for update-intensive workloads. In particular, we propose a new hierarchical approach named the deferred update methodology. Instead of directly updating the data records, first we buffer the changes due to update operations as logs in two intermediate in-flash layers. Next, we apply multiple update logs in bulk to the data records. Experimental results show that our proposed methodology significantly improves update processing overhead and longevity of flash-based storage.","PeriodicalId":350968,"journal":{"name":"2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116723587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}