{"title":"Data allocation strategies for the management of Quality of Service in Virtualised Storage Systems","authors":"F. Franciosi, W. Knottenbelt","doi":"10.1109/MSST.2011.5937229","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937229","url":null,"abstract":"The amount of data managed by organisations continues to grow relentlessly. Driven by the high costs of maintaining multiple local storage systems, there is a well established trend towards storage consolidation using multi-tier Virtualised Storage Systems (VSSs). At the same time, storage infrastructures are increasingly subject to stringent Quality of Service (QoS) demands. Within a VSS, it is challenging to match desired QoS with delivered QoS, considering the latter can vary dramatically both across and within tiers. Manual efforts to achieve this match require extensive and ongoing human intervention.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"307 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115617818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ZoneFS: Stripe remodeling in cloud data centers","authors":"Lanyue Lu, Dean Hildebrand, Renu Tewari","doi":"10.1109/MSST.2011.5937223","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937223","url":null,"abstract":"Cloud data centers will contain tens of thousands of servers with massive aggregate bandwidth requirements for generating, accessing, and analyzing immense amounts of data. The I/O requirements of the myriad applications that these data centers must support run the gamut from extreme IOPS intensive to extreme bandwidth intensive. Delivering high performance with unreliable commodity hardware for this range of workloads is truly a grand challenge. ZoneFS is a parallel file system that targets cloud data center infrastructures built up of commodity network switches. ZoneFS employs a highly-available and flexible storage architecture that divides a cluster switch hierarchy into zones and stripes data across servers and disks to maximize aggregate I/O throughput and avoid storage server hotspots. In this paper, we present the overall design and implementation of ZoneFS and evaluate its key features with several cloud computing workloads. Our experimental results show that ZoneFS can improve application runtime performance by up to 76% over standard parallel file systems and by up to 85% over Internet-scale file systems.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122571603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliability-aware energy management for hybrid storage systems","authors":"Wes Felter, A. Hylick, J. Carter","doi":"10.1109/MSST.2011.5937221","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937221","url":null,"abstract":"Modern disk-based storage systems are not energy proportional, because disks consume almost as much power when idle (but spinning) as they do when actively accessing data. We combine a power-aware, solid-state (flash) cache and a reliability-aware disk spindown mechanism to significantly improve storage energy proportionality without hurting disk reliability, data integrity, or performance. We evaluated the resulting power- and reliability-aware hybrid flash-disk RAID storage array and found that it reduces energy consumption by 85% compared to a similar-cost, similar-performance typical configuration of all SAS drives that are never spun down. Our design also achieves almost 50% energy savings compared to hybrid flash-disk systems tuned for performance or that do not take full advantage of opportunities for safe spindown. Further, unlike most previous work that exploits spindown to save energy, we limit the rate at which disks are spun down to avoid premature mechanical failures, whereas reliability-unaware spindown algorithms can exceed manufacturer warranted lifetime spindown limits in as little as one year.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"154 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124043013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hot data identification for flash-based storage systems using multiple bloom filters","authors":"Dongchul Park, D. Du","doi":"10.1109/MSST.2011.5937216","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937216","url":null,"abstract":"Hot data identification can be applied to a variety of fields. Particularly in flash memory, it has a critical impact on its performance (due to a garbage collection) as well as its life span (due to a wear leveling). Although the hot data identification is an issue of paramount importance in flash memory, little investigation has been made. Moreover, all existing schemes focus almost exclusively on a frequency viewpoint. However, recency also must be considered equally with the frequency for effective hot data identification. In this paper, we propose a novel hot data identification scheme adopting multiple bloom filters to efficiently capture finer-grained recency as well as frequency. In addition to this scheme, we propose a Window-based Direct Address Counting (WDAC) algorithm to approximate an ideal hot data identification as our baseline. Unlike the existing baseline algorithm that cannot appropriately capture recency information due to its exponential batch decay, our WDAC algorithm, using a sliding window concept, can capture very fine-grained recency information. Our experimental evaluation with diverse realistic workloads including real SSD traces demonstrates that our multiple bloom filter-based scheme outperforms the state-of-the-art scheme. In particular, ours not only consumes 50% less memory and requires less computational overhead up to 58%, but also improves its performance up to 65%.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130113260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WAFTL: A workload adaptive flash translation layer with data partition","authors":"Q. Wei, Bozhao Gong, Suraj Pathak, B. Veeravalli, Lingfang Zeng, K. Okada","doi":"10.1109/MSST.2011.5937217","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937217","url":null,"abstract":"Current FTL schemes have inevitable limitations in terms of memory requirement, performance, garbage collection overhead, and scalability. To overcome these limitations, we propose a workload adaptive flash translation layer referred to as WAFTL. WAFTL explores either page-level or block-level address mapping for normal data block based on access patterns. Page Mapping Block (PMB) is used to store random data and handle large number of partial updates. Block Mapping Block (BMB) is utilized to store sequential data and lower overall mapping table. PMB or BMB is allocated on demand and the number of PMB or BMB eventually depends on workload. An efficient address mapping is designed to reduce overall mapping table and quickly conduct address translation. WAFTL explores a small part of flash space as Buffer Zone to log writes sequentially and migrate data into BMB or PMB based on threshold. Static and dynamic threshold setting are proposed to balance performance and mapping table size. WAFTL has been extensively evaluated under various enterprise workloads. Benchmark results conclusively demonstrate that proposed WAFTL is workload adaptive and achieves up to 80% performance improvement, 83% garbage collection overhead reduction and 50% mapping table reduction compared to existing FTL schemes.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133226351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance modeling and analysis of flash-based storage devices","authors":"H. H. Huang, Shan Li, A. Szalay, A. Terzis","doi":"10.1109/MSST.2011.5937213","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937213","url":null,"abstract":"Flash-based solid-state drives (SSDs) will become key components in future storage systems. An accurate performance model will not only help understand the state-of-the-art of SSDs, but also provide the research tools for exploring the design space of such storage systems. Although over the years many performance models were developed for hard drives, the architectural differences between two device families prevent these models from being effective for SSDs. The hard drive performance models cannot account for several unique characteristics of SSDs, e.g., low latency, slow update, and expensive block-level erase. In this paper, we utilize the black-box modeling approach to analyze and evaluate SSD performance, including latency, bandwidth, and throughput, as it requires minimal a priori information about the storage devices. We construct the black-box models, using both synthetic workloads and real-world traces, on three SSDs, as well as an SSD RAID. We find that, while the black-box approach may produce less desirable performance predictions for hard disks, a black-box SSD model with a comprehensive set of workload characteristics can produce accurate predictions for latency, bandwidth, and throughput with small errors.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116818904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A technique for moving large data sets over high-performance long distance networks","authors":"B. Settlemyer, J. Dobson, S. Hodson, J. Kuehn, S. Poole, T. Ruwart","doi":"10.1109/MSST.2011.5937236","DOIUrl":"https://doi.org/10.1109/MSST.2011.5937236","url":null,"abstract":"In this paper we look at the performance characteristics of three tools used to move large data sets over dedicated long distance networking infrastructure. Although performance studies of wide area networks have been a frequent topic of interest, performance analyses have tended to focus on network latency characteristics and peak throughput using network traffic generators. In this study we instead perform an end-to-end long distance networking analysis that includes reading large data sets from a source file system and committing the data to a remote destination file system. An evaluation of end-to-end data movement is also an evaluation of the system configurations employed and the tools used to move the data. For this paper, we have built several storage platforms and connected them with a high performance long distance network configuration. We use these systems to analyze the capabilities of three data movement tools: BBcp, GridFTP, and XDD. Our studies demonstrate that existing data movement tools do not provide efficient performance levels or exercise the storage devices in their highest performance modes.","PeriodicalId":136636,"journal":{"name":"2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127366713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}