Service Recovery for Large Scale Distributed Publish and Subscription Services for Cyber-Physical Systems and Disaster Management
C. Shih, Hsin-Yi Chen, Zi-You Yeh
2014 IEEE International Conference on Cyber-Physical Systems, Networks, and Applications, 2014-08-25
DOI: 10.1109/CPSNA.2014.27 (https://doi.org/10.1109/CPSNA.2014.27)
Information and communication technology (ICT) has played a critical role in disaster management over the last few decades. One example is messaging services for disaster alerts, rescue workers, and victims. Many of these messaging services are developed on top of existing messaging services, in an ad hoc manner, to meet the communication requirements of different disaster scenarios. However, most, if not all, existing messaging services are designed under the assumption that the underlying network infrastructure is mostly reliable. Unfortunately, this assumption does not hold during and after disasters. In this work, we designed and implemented a service recovery framework, comprising a landmark-based (centralized) algorithm and a distributed algorithm, for publish/subscribe messaging services for disaster management. The developed mechanisms recover a failed service without manual effort. The centralized algorithm uses a landmark node to monitor the services and to recover a failed one; the distributed algorithm is a Paxos-based algorithm that lets the nodes monitoring the failed service agree on a consistent recovery plan. We evaluated the performance of these two mechanisms and discussed the appropriate use scenario for each. The results show that the centralized algorithm should only be used in a service network with no concurrent failures within a local area network, whereas the distributed algorithm is sensitive neither to concurrent failures nor to the size of the service network.
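To make the landmark-based idea concrete, the following is a minimal sketch of how a single monitoring node might detect a silent service and reassign it. It is not the authors' implementation: the heartbeat timeout, the pool of spare hosts, and all names (`LandmarkMonitor`, `register`, `heartbeat`, `recover`) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LandmarkMonitor:
    """Hypothetical landmark node that tracks heartbeats from services."""
    timeout: float                                  # seconds of silence before a service counts as failed
    spares: list = field(default_factory=list)      # standby hosts (assumed to be available)
    last_seen: dict = field(default_factory=dict)   # service name -> time of last heartbeat
    placement: dict = field(default_factory=dict)   # service name -> host currently running it

    def register(self, service, host, now):
        """Start monitoring a service that runs on the given host."""
        self.placement[service] = host
        self.last_seen[service] = now

    def heartbeat(self, service, now):
        """Record that the service reported in at time `now`."""
        self.last_seen[service] = now

    def failed_services(self, now):
        """Return the services that have been silent for longer than the timeout."""
        return sorted(s for s, t in self.last_seen.items() if now - t > self.timeout)

    def recover(self, now):
        """Move each failed service to a spare host and return the recovery plan."""
        plan = {}
        for service in self.failed_services(now):
            if not self.spares:
                break  # no capacity left; remaining failures stay unrecovered
            host = self.spares.pop(0)
            self.placement[service] = host
            self.last_seen[service] = now  # treat the restart as a fresh heartbeat
            plan[service] = host
        return plan


# Usage: broker-2 stops sending heartbeats and is moved to the first spare host.
monitor = LandmarkMonitor(timeout=5.0, spares=["host-c", "host-d"])
monitor.register("broker-1", "host-a", now=0.0)
monitor.register("broker-2", "host-b", now=0.0)
monitor.heartbeat("broker-1", now=4.0)
plan = monitor.recover(now=6.0)
print(plan)
```

In this sketch every decision passes through one node, so the recovery plan is trivially consistent, but the monitor is also a single point of failure. That is in line with the abstract's recommendation to restrict the centralized algorithm to networks without concurrent failures.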