{"title":"Understanding storage I/O behaviors of mobile applications","authors":"Jace Courville, F. Chen","doi":"10.1109/MSST.2016.7897092","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897092","url":null,"abstract":"In the past few years, mobile devices quickly gained high popularity in our daily life. Designed for ultra-mobility, these small yet powerful devices are fundamentally distinct from traditional computer systems (e.g., PCs and servers) - from the internal hardware architecture and software stack, to application behaviors. Storage, the slowest component in the I/O stack, plays an important role in mobile systems and can greatly affect user experience. In this paper, we present a set of comprehensive experimental studies on mobile storage and attempt to gain insight on the unique behaviors of mobile applications and characterize the performance properties of underlying mobile storage. In our experiments, we carefully selected 13 representative mobile workloads from 5 different categories. Our studies reveal several unexpected observations on mobile storage. Based on these findings, we further discuss the associated implications to mobile systems and application designers. We hope this work can inspire system architects, application designers, and practitioners to pay specific attention to the high-latency I/O operations, rather than completely relying on the default APIs. We also suggest a further look to new opportunities, such as adopting a faster medium in the mobile system architecture, for future research.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129379652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting latency variation for access conflict reduction of NAND flash memory","authors":"Jinhua Cui, Weiguo Wu, Xingjun Zhang, Jianhang Huang, Yinfeng Wang","doi":"10.1109/MSST.2016.7897088","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897088","url":null,"abstract":"NAND flash memory has been widely used in storage systems by offering greater read/write performance and lower power consumption than mechanical hard drives. Recently, the tradeoff between endurance, write speed, and read speed has been exploited from many ways for I/O performance improvement, which also induce the read/write latency variation. In this paper, the latency variation is exploited in I/O scheduling for access characteristic guided read and write latency minimization. First, with the understanding of the relationship among read latency, write latency and raw bit error rates (RBER), different ways to exploit the relationship for read and write latency reduction is discussed. Then, an I/O scheduling scheme is proposed by using hotness and retention age of accessed data to determine the speed of writes or reads, giving scheduling priority to fast writes and fast reads for conflict reduction. Experiments with various traces reveal that the proposed technique achieves significant read and write performance improvements.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115523422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sorted deduplication: How to process thousands of backup streams","authors":"J. Kaiser, Tim Süß, Lars Nagel, A. Brinkmann","doi":"10.1109/MSST.2016.7897082","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897082","url":null,"abstract":"The requirements of deduplication systems have changed in the last years. Early deduplication systems had to process dozens to hundreds of backup streams at the same time while today they are able to process hundreds to thousands of them. Traditional approaches rely on stream-locality, which supports parallelism, but which easily leads to many non-contiguous disk accesses, as each stream competes with all other streams for the available resources. This paper presents a new exact deduplication approach designed for processing thousands of backup streams at the same time on the same fingerprint index. The underlying approach destroys the traditionally exploited temporal chunk locality and creates a new one by sorting fingerprints. The sorting leads to perfectly sequential disk access patterns on the backup servers, while only slightly increasing the load on the clients. In our experiments, the new approach generates up to 113 times less I/Os than the exact Data Domain deduplication file system and up to 12 times less I/Os than the approximate Sparse Indexing, while consuming less memory at the same time.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125189391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pfimbi: Accelerating big data jobs through flow-controlled data replication","authors":"Simbarashe Dzinamarira, Florin Dinu, T. Ng","doi":"10.1109/MSST.2016.7897074","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897074","url":null,"abstract":"The performance of HDFS is critical to big data software stacks and has been at the forefront of recent efforts from the industry and the open source community. A key problem is the lack of flexibility in how data replication is performed. To address this problem, this paper presents Pfimbi, the first alternative to HDFS that supports both synchronous and flow-controlled asynchronous data replication. Pfimbi has numerous benefits: It accelerates jobs, exploits under-utilized storage I/O bandwidth, and supports hierarchical storage I/O bandwidth allocation policies. We demonstrate that for a job trace derived from a Facebook workload, Pfimbi improves the average job runtime by 18% and by up to 46% in the best case. We also demonstrate that flow control is crucial to fully exploiting the benefits of asynchronous replication; removing Pfimbi's flow control mechanisms resulted in a 2.7× increase in job runtime.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128434200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tombolo: Performance enhancements for cloud storage gateways","authors":"Suli Yang, Kiran Srinivasan, K. Udayashankar, S. Krishnan, J. Feng, Yupu Zhang, A. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau","doi":"10.1109/MSST.2016.7897076","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897076","url":null,"abstract":"Object-based cloud storage has been widely adopted for their agility in deploying storage with a very low up-front cost. However, enterprises currently use them to store secondary data and not for expensive primary data. The driving reason is performance; most enterprises conclude that storing primary data in the cloud will not deliver the performance needed to serve typical workloads. Our analysis of real-world traces shows that certain primary data sets can reside in the cloud with its working set cached locally, using a cloud gateway that acts as a caching bridge between local data centers and the cloud. We use a realistic cloud gateway simulator to study the performance and cost of moving such workloads to different cloud backends (like Amazon S3). We find that when equipped with the right techniques, cloud gateways can provide competitive performance and price compared to on-premise storage. We also provide insights on how to build such cloud gateways, especially with respect to caching and prefetching techniques.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129328151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytic models for flash-based SSD performance when subject to trimming","authors":"Robin Verschoren, B. V. Houdt","doi":"10.1109/MSST.2016.7897086","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897086","url":null,"abstract":"Garbage collection is known to have a profound impact on SSD performance as it strongly influences the write amplification. Another key value that impacts the write amplification is the amount of over-provisioning, which lowers the write amplification at the cost of reducing the user-visible storage capacity. Write amplification occurs as the valid pages that remain on a block selected by garbage collection need to be copied to a free block before an erasure can take place. Some of these valid pages may however belong to a deleted file and therefore copying them is redundant. To avoid copying deleted data, the operating system can issue a Trim command to invalidate pages whenever a file is deleted. Prior analytical studies on the write amplification in SSDs assumed that no trimming takes place. In this paper we generalize a number of mean field models to assess the impact of trimming on the write amplification. Using these models we argue that the write amplification in a (large) system with trimming can be determined by analyzing a system without trimming that uses a larger over-provisioning factor (and modified hot fraction, in case of hot and cold data). Using numerical results we further show that trimming cold data results in a more significant reduction in the write amplification compared to trimming hot data.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132303196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HMVFS: A Hybrid Memory Versioning File System","authors":"Shengan Zheng, Linpeng Huang, H. Liu, L. Wu, Jin-hua Zha","doi":"10.1109/MSST.2016.7897079","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897079","url":null,"abstract":"The byte-addressable Non-Volatile Memory (NVM) offers fast, fine-grained access to persistent storage, and a large volume of recent researches are conducted on developing NVM-based in-memory file systems. However, existing approaches focus on low-overhead access to the memory and only guarantee the consistency between data and metadata. In this paper, we address the problem of maintaining consistency among continuous snapshots for NVM-based in-memory file systems. We propose a Hybrid Memory Versioning File System (HMVFS) that achieves fault tolerance efficiently and has low impact on I/O performance. Our results show that HMVFS provides better performance on snapshotting compared with the traditional versioning file systems for many workloads. Specifically, HMVFS has lower snapshotting overhead than BTRFS and NILFS2, improving by a factor of 9.7 and 6.6, respectively. Furthermore, HMVFS imposes minor performance overhead compared with the state-of-the-art in-memory file systems like PMFS.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134563941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-grained metadata journaling on NVM","authors":"Cheng Chen, Jun Yang, Q. Wei, Chundong Wang, Mingdi Xue","doi":"10.1109/MSST.2016.7897077","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897077","url":null,"abstract":"Journaling file systems have been widely used where data consistency must be assured. However, we observed that the overhead of journaling can cause up to 48.2% performance drop under certain kinds of workloads. On the other hand, the emerging high-performance, byte-addressable Non-volatile Memory (NVM) has the potential to minimize such overhead by being used as the journal device. The traditional journaling mechanism based on block devices is nevertheless unsuitable for NVM due to the write amplification of metadata journal we observed. In this paper, we propose a fine-grained metadata journal mechanism to fully utilize the low-latency byte-addressable NVM so that the overhead of journaling can be significantly reduced. Based on the observation that conventional block-based metadata journal contains up to 90% clean metadata that is unnecessary to be journalled, we design a fine-grained journal format for byte-addressable NVM which contains only modified metadata. Moreover, we redesign the process of transaction committing, checkpointing and recovery in journaling file systems utilizing the new journal format. Therefore, thanks to the reduced amount of ordered writes to NVM, the overhead of journaling can be reduced without compromising the file system consistency. Experimental results show that our NVM-based fine-grained metadata journaling is up to 15.8× faster than the traditional approach under FileBench workloads.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123685087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"File system trace replay methods through the lens of metrology","authors":"T. Pereira, F. Brasileiro, Lívia M. R. Sampaio","doi":"10.1109/MSST.2016.7897090","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897090","url":null,"abstract":"There are various methods to evaluate the performance of file systems through the replay of file system traces. Despite this diversity, little attention was given on comparing the alternatives, thus bringing some skepticism about the results attained using these methods. In this paper, to fill this understanding gap, we analyze two popular trace replay methods through the lens of metrology. This case study indicates that the evaluated methods provide similar, good precision but are biased in some scenarios. Our results identified limitations in the implementation of the replay tool as well as flaws in the established practices to experiment with trace replayers as the root causes of the measurement bias. After improving the implementation of the trace replayer and discarding inappropriate experimental practices, we were able to reduce the bias, leading to lower measurement uncertainty. Finally, our case study also shows that, in some cases, collecting only the file system activity is not enough to accurately replay the traces; in these cases, collecting resource consumption information, such as the amount of allocated memory, can improve the quality of trace replay methods.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131225347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive policies for balancing performance and lifetime of mixed SSD arrays through workload sampling","authors":"Sangwhan Moon, A. Reddy","doi":"10.1109/MSST.2016.7897084","DOIUrl":"https://doi.org/10.1109/MSST.2016.7897084","url":null,"abstract":"Solid-state drives (SSDs) have become promising storage components to serve large I/O demands in modern storage systems. Enterprise class (high-end) SSDs are faster and more resilient than client class (low-end) SSDs but they are expensive to be deployed in large scale storage systems. It is an attractive and practical alternative to exploit the high-end SSDs as a cache and low-end SSDs as main storage.","PeriodicalId":299251,"journal":{"name":"2016 32nd Symposium on Mass Storage Systems and Technologies (MSST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114459121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}