Huiqiang Wang, Haizhi Ye, Liang Ying. 2008 International Conference on Internet Computing in Science and Engineering. Published 2008-01-28. DOI: 10.1109/ICICSE.2008.52 (https://doi.org/10.1109/ICICSE.2008.52)
A Self-Recovery Model for Distributed Applications Based on Microreboot
Automatic, fast recovery from failure is an important way to guarantee high availability in distributed application systems. Drawing on microreboot techniques and autonomic computing ideas, this paper analyzes the key issues in realizing self-recovery for distributed applications and then presents a novel microreboot-based self-recovery model. The construction of the model is described in detail from several perspectives, including behavior monitoring, failure management, and recovery policy, and the principles of realizing self-recovery for distributed applications are explained. The model aims to address common failures in large distributed applications and can recover effectively without human intervention.