2019 35th Symposium on Mass Storage Systems and Technologies (MSST): Latest Publications

A Performance Study of Lustre File System Checker: Bottlenecks and Potentials
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00-20
Dong Dai, Om Rameshwar Gatla, Mai Zheng
Abstract: Lustre, one of the most popular parallel file systems in high-performance computing (HPC), provides a POSIX interface and maintains a large set of POSIX-related metadata, which can be corrupted by hardware failures, software bugs, configuration errors, and the like. The Lustre file system checker (LFSCK) is the remedy tool that detects metadata inconsistencies and restores a corrupted Lustre to a valid state, and hence is critical for reliable HPC. Unfortunately, in practice, LFSCK runs slowly on large deployments, making system administrators reluctant to use it as a routine maintenance tool. Consequently, cascading errors may lead to unrecoverable failures, resulting in significant downtime or even data loss. Given that HPC is rapidly marching toward exascale and much larger Lustre file systems are being deployed, it is critical to understand the performance of LFSCK. In this paper, we study the performance of LFSCK to identify its bottlenecks and analyze its performance potential. Specifically, we design an aging method based on real-world HPC workloads to age Lustre to representative states, and then systematically evaluate and analyze how LFSCK runs on such an aged Lustre by monitoring the utilization of various resources. From our experiments, we find that the design and implementation of LFSCK are sub-optimal: it suffers from a scalability bottleneck on the metadata server (MDS), a relatively high fan-out ratio in network utilization, and unnecessary blocking among internal components. Based on these observations, we discuss potential optimizations and present preliminary results.
Citations: 6
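The aging step described above can be pictured with a short script that replays a mix of create/append/delete operations to drive a fresh file system toward a fragmented, well-used state before the checker is run. The following Python sketch is purely illustrative: the operation mix, sizes, and target directory are hypothetical stand-ins for the authors' HPC-workload-driven aging method.

```python
# Minimal sketch of workload-driven filesystem aging (illustrative only).
import os
import random

def age_filesystem(root: str, num_ops: int = 10_000, seed: int = 42) -> None:
    """Age a directory tree by replaying a synthetic create/append/delete mix."""
    rng = random.Random(seed)
    live_files = []
    os.makedirs(root, exist_ok=True)
    for i in range(num_ops):
        op = rng.choices(["create", "append", "delete"], weights=[5, 3, 2])[0]
        if op == "create" or not live_files:
            # Creates dominate so the tree keeps growing, as under real workloads.
            path = os.path.join(root, f"f{i:07d}")
            with open(path, "wb") as f:
                f.write(rng.randbytes(rng.randint(64, 4096)))
            live_files.append(path)
        elif op == "append":
            with open(rng.choice(live_files), "ab") as f:
                f.write(rng.randbytes(rng.randint(64, 1024)))
        else:
            # Deletions leave holes that fragment metadata and stress the checker.
            os.remove(live_files.pop(rng.randrange(len(live_files))))

age_filesystem("/tmp/aged_fs_demo", num_ops=1000)
```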
SES-Dedup: a Case for Low-Cost ECC-based SSD Deduplication
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00009
Zhichao Yan, Hong Jiang, Song Jiang, Yujuan Tan, Hao Luo
Abstract: Integrating the data deduplication function into solid-state drives (SSDs) helps avoid writing duplicate content to NAND flash chips, which not only effectively reduces the number of Program/Erase (P/E) operations, extending the device's lifespan, but also proportionally enlarges the effective capacity of the SSD, improving the performance of its behind-the-scenes maintenance tasks such as wear-leveling (WL) and garbage collection (GC). However, these benefits of deduplication come at a non-trivial computational cost incurred by the embedded SSD controller to compute cryptographic hashes. To address this overhead, some researchers have suggested replacing cryptographic hashes with the error correction codes (ECCs) already embedded in SSD chips to detect duplicate content. However, all existing attempts have ignored the impact of the data randomization (scrambler) module that is widely used in modern SSDs, making it impractical to directly integrate ECC-based deduplication into commercial SSDs. In this work, we revisit the SSD's internal structure and propose the first deduplication-capable SSD that can bypass the data scrambler module to enable low-cost ECC-based data deduplication. Specifically, we propose two design solutions, one on the host side and the other on the device side, to enable ECC-based deduplication. With this approach, we can effectively exploit the SSD's built-in ECC module to calculate the hash values of stored data for deduplication. We have evaluated our SES-Dedup approach by replaying data traces in an SSD simulator and found that it can remove up to 30.8% of redundant data with up to 17.0% write performance improvement over the baseline SSD.
Citations: 3
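The central idea, reusing a cheap, fixed-size code that the device already computes as the dedup fingerprint, can be sketched briefly. Unlike a cryptographic hash, an ECC code word is not collision-resistant, so a fingerprint match is only a hint and must be verified against the stored content. In the Python sketch below, zlib.crc32 is a purely illustrative stand-in for the per-page ECC bytes; it is not the code the paper uses.

```python
# Minimal sketch of ECC-style deduplication with a weak code as fingerprint.
import zlib

class EccDedupStore:
    """Dedup index keyed by a weak, fixed-size code word instead of SHA-1/256."""
    def __init__(self):
        self.pages = {}     # physical page number -> page contents
        self.by_code = {}   # code word -> physical page numbers with that code
        self.next_ppn = 0

    def write(self, data: bytes) -> int:
        code = zlib.crc32(data)          # stand-in for the per-page ECC bytes
        for ppn in self.by_code.get(code, []):
            if self.pages[ppn] == data:  # code match is only a hint: verify
                return ppn               # duplicate page, no flash write needed
        ppn, self.next_ppn = self.next_ppn, self.next_ppn + 1
        self.pages[ppn] = data
        self.by_code.setdefault(code, []).append(ppn)
        return ppn

store = EccDedupStore()
first = store.write(b"4KiB page contents" * 200)
second = store.write(b"4KiB page contents" * 200)
assert first == second   # the second write was deduplicated
```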
FastBuild: Accelerating Docker Image Building for Efficient Development and Deployment of Container
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00-18
Zhuo Huang, Song Wu, Song Jiang, Hai Jin
Abstract: Docker containers have been increasingly adopted on various computing platforms to provide a lightweight virtualized execution environment. Compared to virtual machines, this technology can often reduce the launch time from a few minutes to less than 10 seconds, assuming the Docker image is locally available. However, Docker images are highly customizable and are mostly built at runtime from a remote base image by running instructions in a script (the Dockerfile). During instruction execution, a large number of input files may have to be retrieved via the Internet. Image building may be an iterative process, as one may need to repeatedly modify the Dockerfile until the desired image composition is achieved. In this process, every input file required by an instruction has to be remotely retrieved, even if it was recently downloaded, which can make building an image and launching a container unexpectedly slow. To address the issue, we propose a technique, named FastBuild, that maintains a local file cache to minimize expensive file downloads. By non-intrusively intercepting remote file requests and supplying files locally, FastBuild enables file caching in a manner transparent to image building. To further accelerate image building, FastBuild overlaps the execution of instructions with the writing of intermediate image layers to disk. We have implemented FastBuild, and experiments with images and Dockerfiles obtained from Docker Hub show that the system can improve building speed by up to 10 times and reduce downloaded data by 72%.
Citations: 18
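FastBuild's caching idea can be pictured as a fetch wrapper that serves repeated downloads from a local content cache. This minimal Python sketch is a hypothetical illustration: FastBuild itself intercepts requests transparently inside the image builder rather than exposing a function like this, and the cache layout here is invented.

```python
# Minimal sketch of a local download cache for iterative image builds.
import hashlib
import os
import urllib.request

CACHE_DIR = "/tmp/fastbuild-cache"   # hypothetical cache location

def cached_fetch(url: str) -> bytes:
    """Return the file at `url`, downloading it at most once."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, hashlib.sha256(url.encode()).hexdigest())
    if os.path.exists(path):                     # cache hit: no network I/O
        with open(path, "rb") as f:
            return f.read()
    with urllib.request.urlopen(url) as resp:    # cache miss: download once
        data = resp.read()
    with open(path, "wb") as f:                  # keep for later rebuilds
        f.write(data)
    return data

# The first build downloads; every Dockerfile-edit-and-rebuild cycle after
# that is served locally, e.g.:
# data = cached_fetch("https://example.com/some-package.tar.gz")
```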
Title Page iii
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/msst.2019.00002
Citations: 0
Metadedup: Deduplicating Metadata in Encrypted Deduplication via Indirection
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00007
Jingwei Li, P. Lee, Yanjing Ren, Xiaosong Zhang
Abstract: Encrypted deduplication seamlessly combines encryption and deduplication to provide confidentiality guarantees for the physical data in deduplication storage, yet it incurs substantial metadata storage overhead due to the additional storage of keys. We present a new encrypted deduplication storage system called Metadedup, which suppresses metadata storage by also applying deduplication to metadata. Its idea builds on indirection, which adds another level of metadata chunks that record metadata information. We find that metadata chunks are highly redundant in real-world workloads and hence can be effectively deduplicated. In addition, metadata chunks can be protected under the same encrypted deduplication framework, thereby providing confidentiality guarantees for metadata as well. We evaluate Metadedup through microbenchmarks, prototype experiments, and trace-driven simulation. Metadedup has limited computational overhead in metadata processing, and adds only 6.19% performance overhead on average when storing files in a networked setting. Also, for real-world backup workloads, Metadedup reduces metadata storage by up to 97.46% at the expense of only up to 1.07% indexing overhead for metadata chunks.
Citations: 8
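The indirection idea can be sketched as two levels of deduplication: per-chunk metadata records are packed into metadata chunks, and both data chunks and metadata chunks are indexed by fingerprint in the same store. The record format, chunk sizes, and plain SHA-256 fingerprints below are illustrative simplifications; the real system operates within an encrypted deduplication framework.

```python
# Minimal sketch of metadata deduplication via indirection.
import hashlib

store = {}   # fingerprint -> chunk bytes, shared by data and metadata chunks

def dedup_put(chunk: bytes) -> str:
    """Store a chunk once per unique content; return its fingerprint."""
    fp = hashlib.sha256(chunk).hexdigest()
    store.setdefault(fp, chunk)
    return fp

def put_file(data: bytes, chunk_size: int = 4096, recs_per_mchunk: int = 128):
    # Level 1: deduplicate data chunks, collecting one metadata record each.
    records = [dedup_put(data[i:i + chunk_size]).encode()
               for i in range(0, len(data), chunk_size)]
    # Level 2 (indirection): pack records into metadata chunks and deduplicate
    # those too; the file recipe keeps only the metadata-chunk fingerprints.
    return [dedup_put(b"".join(records[i:i + recs_per_mchunk]))
            for i in range(0, len(records), recs_per_mchunk)]

r1 = put_file(b"backup payload " * 50_000)
r2 = put_file(b"backup payload " * 50_000)
assert r1 == r2   # identical metadata chunks were deduplicated as well
```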
BFO: Batch-File Operations on Massive Files for Consistent Performance Improvement
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00-17
Yang Yang, Q. Cao, Hong Jiang, Li Yang, Jie Yao, Yuanyuan Dong, Puyuan Yang
Abstract: Existing local file systems, designed to support only a typical single-file access pattern, can deliver poor performance when accessing a batch of files, especially small files. This single-file pattern essentially serializes accesses to batched files one by one, resulting in a large number of non-sequential, random, and often dependent I/Os between file data and metadata at the storage end. We first experimentally analyze the root cause of this inefficiency in batch-file accesses. We then propose a novel batch-file access approach, referred to as BFO for its set of optimized Batch-File Operations, developing novel BFOr and BFOw operations for the fundamental read and write processes respectively, using a two-phase access for metadata and data jointly. BFO offers dedicated interfaces for batch-file accesses, and its additional processes integrate into existing file systems without modifying their structures and procedures. We implement a BFO prototype on ext4, one of the most popular file systems. Our evaluation results show that the batch-file read and write performance of BFO is consistently higher than that of the traditional approaches regardless of access patterns, data layouts, and storage media, with both synthetic and real-world file sets. BFO improves read performance by up to 22.4× on HDD and 1.8× on SSD, and boosts write performance by up to 111.4× on HDD and 2.9× on SSD. BFO also demonstrates consistent performance advantages when applied to four representative applications: Linux cp, Tar, GridFTP, and Hadoop.
Citations: 4
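The two-phase access can be pictured as: first resolve metadata for the whole batch, then read file data in an order chosen to approximate sequential device access. In this Python sketch, sorting by inode number as a layout proxy is an assumption of the illustration, not necessarily BFO's exact policy.

```python
# Minimal sketch of a two-phase batch read (metadata pass, then data pass).
import os

def batch_read(paths):
    """Read a batch of files with metadata and data phases separated."""
    # Phase 1: resolve metadata for the whole batch in one pass.
    metas = [(os.stat(p).st_ino, p) for p in paths]
    # Phase 2: read data ordered by inode number to approximate the on-disk
    # layout, replacing many dependent random I/Os with a near-sequential scan.
    results = {}
    for _ino, path in sorted(metas):
        with open(path, "rb") as f:
            results[path] = f.read()
    return results
```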
Scalable QoS for Distributed Storage Clusters using Dynamic Token Allocation
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00-19
Yuhan Peng, Qingyue Liu, P. Varman
Abstract: This paper addresses the problem of providing performance QoS guarantees in a clustered storage system. Multiple related storage objects are grouped into logical containers called buckets, which are distributed over the servers based on the placement policies of the storage system. QoS is provided at the level of buckets. The service credited to a bucket is the aggregate of the IOs received by its objects at all servers, and depends on individual time-varying demands and congestion at the servers. We present a token-based, coarse-grained approach to providing IO reservations and limits to buckets. We propose pShift, a novel token allocation algorithm that works in conjunction with token-sensitive scheduling at each server to control the aggregate IOs received by each bucket across multiple servers. pShift determines the optimal token distribution based on the estimated bucket demands and server IOPS capacities. Compared to existing approaches, pShift has far smaller overhead and can be accelerated using parallelization and approximation. Our experimental results show that pShift provides accurate QoS among buckets with different access patterns and handles runtime demand changes well.
Citations: 3
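A drastically simplified view of demand-driven token allocation: each bucket's tokens for the next window are split across servers in proportion to its estimated per-server demand, then scaled down on servers whose capacity would be exceeded. This Python sketch illustrates the flavor of the problem only; pShift computes an optimal allocation rather than this naive proportional one.

```python
# Minimal sketch of per-window token allocation (naive, not pShift's optimum).
def allocate_tokens(bucket_tokens, demand, capacity):
    """
    bucket_tokens: {bucket: tokens reserved for the next time window}
    demand:        {bucket: {server: estimated IOs at that server}}
    capacity:      {server: IOPS capacity for the window}
    returns:       {server: {bucket: tokens granted}}
    """
    alloc = {s: {} for s in capacity}
    for bucket, tokens in bucket_tokens.items():
        total = sum(demand[bucket].values()) or 1
        for server, d in demand[bucket].items():
            alloc[server][bucket] = tokens * d / total  # follow the demand
    for server, cap in capacity.items():                # respect capacities
        granted = sum(alloc[server].values())
        if granted > cap:
            for bucket in alloc[server]:
                alloc[server][bucket] *= cap / granted
    return alloc

# Bucket A's demand is skewed toward server s1, which cannot absorb it all:
print(allocate_tokens({"A": 100},
                      {"A": {"s1": 75, "s2": 25}},
                      {"s1": 50, "s2": 50}))
# -> {'s1': {'A': 50.0}, 's2': {'A': 25.0}}
```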
Efficient Encoding and Reconstruction of HPC Datasets for Checkpoint/Restart
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00-14
Jialing Zhang, Xiaoyan Zhuo, Aekyeung Moon, Hang Liu, S. Son
Abstract: As the amount of data produced by HPC applications reaches the exabyte range, compression techniques are often adopted to reduce checkpoint time and volume. Since lossless techniques are limited in their ability to achieve appreciable data reduction, lossy compression becomes a preferable option. In this work, a lossy compression technique with highly efficient encoding, purpose-built error control, and high compression ratios is proposed. Specifically, we apply a discrete cosine transform with a novel block decomposition strategy directly to double-precision floating-point datasets, instead of the prevailing prediction-based techniques. Further, we design an adaptive quantization with two task-oriented quantizers: one guaranteeing error bounds and one targeting higher compression ratios. Using real-world HPC datasets, our approach achieves 3x-38x compression ratios while guaranteeing specified error bounds, showing performance comparable to the state-of-the-art lossy compression methods SZ and ZFP. Moreover, our method provides viable reconstructed data for various checkpoint/restart scenarios in the FLASH application, and is thus a promising approach for lossy data compression in HPC I/O software stacks.
Citations: 16
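The encoding pipeline, transform a block and then quantize coefficients under an absolute error bound, can be sketched with an orthonormal 1-D DCT and a uniform quantizer whose step is conservatively derived from the bound. Block size and quantizer design below are illustrative; the paper's block decomposition and adaptive, task-oriented quantizers are more sophisticated.

```python
# Minimal sketch of DCT-based lossy compression with a guaranteed error bound.
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix; its transpose is the inverse transform."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2 / n)

def compress(block: np.ndarray, err_bound: float):
    coeffs = dct_matrix(len(block)) @ block
    # Conservative uniform step: even if every coefficient is off by step/2,
    # the reconstructed values stay within err_bound.
    step = 2 * err_bound / len(block)
    return np.round(coeffs / step).astype(np.int64), step

def decompress(qcoeffs: np.ndarray, step: float) -> np.ndarray:
    return dct_matrix(len(qcoeffs)).T @ (qcoeffs * step)

data = np.sin(np.linspace(0, 8 * np.pi, 64))    # a smooth 64-sample block
qcoeffs, step = compress(data, err_bound=1e-3)
assert np.max(np.abs(decompress(qcoeffs, step) - data)) <= 1e-3
```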
Parity-Only Caching for Robust Straggler Tolerance
2019 35th Symposium on Mass Storage Systems and Technologies (MSST) · Pub Date: 2019-05-01 · DOI: 10.1109/MSST.2019.00006
Mi Zhang, Qiuping Wang, Zhirong Shen, P. Lee
Abstract: Stragglers (i.e., nodes with slow performance) are prevalent and cause performance instability in large-scale storage systems, yet detecting stragglers in practice is challenging. We make a case by showing how erasure-coded caching provides robust straggler tolerance without relying on timely and accurate straggler detection, while incurring limited redundancy overhead in caching. We first analytically motivate that caching only parity blocks can achieve effective straggler tolerance. To this end, we present POCache, a parity-only caching design that provides robust straggler tolerance. To limit the erasure coding overhead, POCache slices blocks into smaller subblocks and parallelizes the coding operations at the subblock level. It also leverages a straggler-aware cache algorithm that takes into account both file access popularity and straggler estimation to decide which parity blocks to cache. We implement a POCache prototype atop Hadoop 3.1 HDFS, while preserving the performance and functionality of normal HDFS operations. Our extensive experiments on both local and Amazon EC2 clusters show that in the presence of stragglers, POCache can reduce read latency by up to 87.9% compared to vanilla HDFS.
Citations: 5
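Why caching only parity tolerates a straggler can be shown with single-parity XOR, a stand-in here for the general erasure codes POCache supports: with the parity block cached, a stripe read can complete from any k-1 of the k data blocks, reconstructing the slow one instead of waiting for it. This Python sketch is illustrative only.

```python
# Minimal sketch of straggler-tolerant reads with a cached XOR parity block.
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length blocks column-wise (single-parity erasure code)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def read_stripe(fetched, k, cached_parity):
    """Finish a stripe read from any k-1 data blocks plus the cached parity.

    fetched: {block index: block bytes} for the k-1 blocks that arrived first.
    """
    missing = next(i for i in range(k) if i not in fetched)
    rebuilt = xor_parity(list(fetched.values()) + [cached_parity])
    blocks = dict(fetched)
    blocks[missing] = rebuilt       # straggler's block rebuilt, not waited on
    return b"".join(blocks[i] for i in range(k))

data = [b"aaaa", b"bbbb", b"cccc"]   # a stripe with k = 3 data blocks
parity = xor_parity(data)            # this is what POCache would cache
# Block 1 is slow; the read completes without it:
assert read_stripe({0: data[0], 2: data[2]}, 3, parity) == b"aaaabbbbcccc"
```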