Flexible failure handling for cooperative processes in distributed systems
Authors: Artin Avanes, J. Freytag
Venue: 2009 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing
Published: 2009-12-28
DOI: 10.4108/ICST.COLLABORATECOM2009.8306
Distributed systems will increasingly be built on top of wireless networks, such as sensor networks or hand-held devices with advanced sensing and computational abilities. Supporting cooperative processes executed by such unreliable and dynamic system components poses a number of new technical challenges. In terms of recovery, limited resource capabilities have to be considered when re-scheduling failed process activities. In terms of concurrency, a non-blocking protocol is required to allow a high degree of parallelism. In this paper, we introduce a flexible and resource-oriented failure handling mechanism for cooperative processes in hierarchical and distributed systems. The objective is to ensure both transactional semantics and the selection of suitable nodes with respect to available resource capabilities. Based on a nested execution model, we develop a multi-stage algorithm that applies constraint solving techniques in parallel, thus achieving a more efficient recovery. We evaluate our proposed techniques in a prototype implementation and demonstrate significant performance gains from parallel re-scheduling.
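To make the idea of resource-aware, parallel re-scheduling more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical model in which each failed activity needs a single numeric amount of one resource, nodes are grouped into clusters, and each cluster's placement problem is solved independently by a small backtracking search. The names `assign` and `reschedule`, the cluster layout, and the capacity numbers are all illustrative assumptions. The thread pool only shows the parallel structure; in CPython it would not speed up CPU-bound solving.

```python
from concurrent.futures import ThreadPoolExecutor


def assign(activities, nodes):
    """Backtracking search: place every activity on a node with enough
    free capacity. Returns {activity: node}, or None if infeasible."""
    if not activities:
        return {}
    (name, need), rest = activities[0], activities[1:]
    for node, free in nodes.items():
        if free >= need:
            # Reserve capacity on this node and try to place the rest.
            remaining = dict(nodes)
            remaining[node] = free - need
            result = assign(rest, remaining)
            if result is not None:
                return {name: node, **result}
    return None  # no node can host this activity under current choices


def reschedule(failed, clusters):
    """Solve one placement problem per cluster in parallel and return the
    first feasible plan in cluster order, or None if no cluster fits."""
    with ThreadPoolExecutor() as pool:
        plans = list(pool.map(lambda c: assign(failed, clusters[c]), clusters))
    for cluster, plan in zip(clusters, plans):
        if plan is not None:
            return cluster, plan
    return None


# Hypothetical input: (activity name, required resource units).
failed = [("sense", 3), ("aggregate", 2)]
# Hypothetical hierarchy: cluster -> {node: free resource units}.
clusters = {
    "A": {"n1": 2, "n2": 2},
    "B": {"n3": 4, "n4": 3},
}
outcome = reschedule(failed, clusters)
print(outcome)
```

In this sketch, cluster A has no node with three free units, so its sub-problem is infeasible and the plan comes from cluster B. A real system would also have to preserve transactional semantics while moving activities, which this example does not model.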