{"title":"Designing a robust namespace for distributed file services","authors":"Zheng Zhang, C. Karamanolis","doi":"10.1109/RELDIS.2001.969770","DOIUrl":null,"url":null,"abstract":"A number of ongoing research projects follow a partition-based approach to provide highly scalable distributed storage services. These systems maintain namespaces that reference objects distributed across multiple locations in the system. Typically, atomic commitment protocols, such as 2-phase commit, are used for updating the namespace, in order to guarantee its consistency even in the presence of failures. Atomic commitment protocols are known to impose a high overhead to failure-free execution. Furthermore, they use conservative recovery procedures and may considerably restrict the concurrency of overlapping operations in the system. This paper proposes a set of new protocols implementing the fundamental operations in a distributed namespace. The protocols impose a minimal overhead to failure-free execution. They are robust against both communication and host failures, and use aggressive recovery procedures to re-execute incomplete operations. The proposed protocols are compared with their 2-phase commit counterparts and are shown to outperform them in all critical performance factors: communication round-trips, synchronous I/O, operation concurrency.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"1013 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RELDIS.2001.969770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
A number of ongoing research projects follow a partition-based approach to provide highly scalable distributed storage services. These systems maintain namespaces that reference objects distributed across multiple locations in the system. Typically, atomic commitment protocols, such as 2-phase commit, are used for updating the namespace, in order to guarantee its consistency even in the presence of failures. Atomic commitment protocols are known to impose a high overhead to failure-free execution. Furthermore, they use conservative recovery procedures and may considerably restrict the concurrency of overlapping operations in the system. This paper proposes a set of new protocols implementing the fundamental operations in a distributed namespace. The protocols impose a minimal overhead to failure-free execution. They are robust against both communication and host failures, and use aggressive recovery procedures to re-execute incomplete operations. The proposed protocols are compared with their 2-phase commit counterparts and are shown to outperform them in all critical performance factors: communication round-trips, synchronous I/O, operation concurrency.