{"title":"Reliable cluster computing with a new checkpointing RAID-x architecture","authors":"K. Hwang, Hai Jin, Roy S. C. Ho, Wonwoo Ro","doi":"10.1109/HCW.2000.843742","DOIUrl":null,"url":null,"abstract":"In a serverless cluster of PCs or workstations, the cluster must allow remote file accesses or parallel I/O directly performed over disks distributed to all client nodes. We introduce a new distributed disk array, called the RAID-x, for use in serverless clusters. The RAID-x architecture is based on an orthogonal striping and mirroring (OSM) scheme, which exploits full-bandwidth and protects the system from all single disk failures. The performance of the RAID-x is experimentally proven superior to RAID-1 and NFS in the Linux cluster environment. We propose a new striped checkpointing scheme, leveraging on striped parallelism and pipelined writing of successive disk stripes. This RAID-x architecture greatly enhances the throughput, reliability, and availability of scalable clusters. It appeals especially to I/O-centric cluster applications.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HCW.2000.843742","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
In a serverless cluster of PCs or workstations, the cluster must allow remote file accesses or parallel I/O directly performed over disks distributed to all client nodes. We introduce a new distributed disk array, called the RAID-x, for use in serverless clusters. The RAID-x architecture is based on an orthogonal striping and mirroring (OSM) scheme, which exploits full-bandwidth and protects the system from all single disk failures. The performance of the RAID-x is experimentally proven superior to RAID-1 and NFS in the Linux cluster environment. We propose a new striped checkpointing scheme, leveraging on striped parallelism and pipelined writing of successive disk stripes. This RAID-x architecture greatly enhances the throughput, reliability, and availability of scalable clusters. It appeals especially to I/O-centric cluster applications.