Characterizing the Dependability of Distributed Storage Systems Using a Two-Layer Hidden Markov Model-Based Approach
Xin Chen, James Warren, Fang Han, Xubin He
2010 IEEE Fifth International Conference on Networking, Architecture, and Storage (published 2010-07-15)
DOI: 10.1109/NAS.2010.28 (https://doi.org/10.1109/NAS.2010.28)
Citations: 3
Abstract
Dependability is of paramount importance in modern distributed storage systems. A key challenge in deploying a storage system with given dependability requirements, or in improving the dependability of an existing system, is how to characterize that dependability comprehensively and efficiently. In this paper, we present a two-layer Hidden Markov Model (HMM) to characterize the dependability of a distributed storage system, focusing on the parallel file system layer. By training the model with observable measurements, such as I/O performance, collected under faulty scenarios, we quantify system dependability as a tuple of state transition probability, service degradation, and fault latency under those scenarios. Our experimental results on a distributed storage system running PVFS (Parallel Virtual File System) demonstrate the effectiveness of our HMM-based approach, which efficiently captures the behavior patterns of the target system under disk faults and memory overuse.
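To make the idea of inferring hidden system states from observable I/O measurements more concrete, the following is a minimal sketch of Viterbi decoding on a single-layer HMM. It is not the paper's two-layer model and involves no training: the state names, the discretized throughput symbols, all probabilities, and the helper `first_departure` are hypothetical values and names chosen purely for illustration.

```python
import math

# Hypothetical hidden states and discretized I/O-throughput observations.
STATES = ["healthy", "degraded", "failed"]

# Hand-picked illustrative parameters (assumptions, not values from the paper).
START = {"healthy": 0.90, "degraded": 0.08, "failed": 0.02}
TRANS = {
    "healthy":  {"healthy": 0.90, "degraded": 0.08, "failed": 0.02},
    "degraded": {"healthy": 0.10, "degraded": 0.70, "failed": 0.20},
    "failed":   {"healthy": 0.05, "degraded": 0.05, "failed": 0.90},
}
EMIT = {
    "healthy":  {"high": 0.80, "mid": 0.15, "low": 0.05},
    "degraded": {"high": 0.15, "mid": 0.70, "low": 0.15},
    "failed":   {"high": 0.05, "mid": 0.15, "low": 0.80},
}


def viterbi(observations):
    """Return the most likely hidden-state path for a sequence of symbols."""
    # Work in log space to avoid underflow on long sequences.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for symbol in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in s at this step.
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def first_departure(path, normal="healthy"):
    """Index of the first decoded state other than `normal`, or None."""
    for i, s in enumerate(path):
        if s != normal:
            return i
    return None
```

With these illustrative parameters, `viterbi(["high", "high", "low", "low", "low"])` decodes two `healthy` steps followed by three `failed` steps, so `first_departure` returns 2. Comparing such an index against a known fault-injection time is one rough way to think about a latency measurement; how the paper actually defines and measures fault latency is not specified in the abstract.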