Authors: Zhe Wang, N. Minsky
Venue: 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing
Published: 2014-11-11
DOI: https://doi.org/10.4108/ICST.COLLABORATECOM.2014.257585
Fault tolerance in heterogeneous distributed systems
Dependability of heterogeneous distributed systems is an important issue. Coordination failures may occur even if all participants adhere to the given coordination protocol. Fault tolerance (FT) properties are difficult to achieve, especially at the application level. Current FT techniques share a reliance on the code of the various system components, which are often required to be written in a specific language. Such techniques are feasible for homogeneous systems, or at least for systems that are designed and maintained by a single administrative domain. But such code-based techniques are generally unreliable for open systems, due to the lack of overall control over the code of components. This leaves open distributed systems vulnerable to their own faults and to attacks on them. However, certain types of FT measures can be established in distributed systems by controlling the flow of messages between system components, independently of the code of those components, which we plan to do via a distributed coordination and control mechanism called Law-Governed Interaction. We demonstrate in this paper that a substantial range of FT measures can be established entirely by controlling messaging. Moreover, although the FT measures to be developed are meant mostly for open systems, some of them can be useful for distributed systems in general, even where traditional code-based techniques are feasible.