A planning based approach to failure recovery in distributed systems
Authors: N. Arshad, D. Heimbigner, A. Wolf
Venue: Workshop on Self-Healing Systems
Published: 2004-10-31
DOI: 10.1145/1075405.1075407
Citations: 60
Abstract
Failure recovery in distributed systems is a difficult challenge because of the requirement for high availability. Failure scenarios are usually unpredictable and cannot easily be foreseen. In this research we propose a planning-based approach to failure recovery. This approach automates failure recovery by capturing the state after failure, defining an acceptable recovered state as a goal, and applying planning to get from the initial state to the goal state. By using planning, the approach can recover from a variety of failed states and reach any of several acceptable states, ranging from minimal functionality to complete recovery.
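The idea of planning from a captured post-failure state to a goal state can be illustrated with a toy example. The sketch below is not the authors' planner; the abstract does not say which planner or domain model they use. It assumes a STRIPS-style representation (facts as strings, actions with preconditions, add lists, and delete lists) and a plain breadth-first search. The hosts `h1` and `h2`, the fact names, and the action names are all hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def plan(initial, goal, actions):
    """Breadth-first forward search over states.

    Returns a shortest list of action names that reaches the goal,
    or None if no sequence of the given actions reaches it.
    """
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:          # every goal fact holds
            return steps
        for a in actions:
            if a.pre <= state:     # action is applicable
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [a.name]))
    return None


def fs(*facts):
    return frozenset(facts)


# Hypothetical recovery actions: h1 has failed, h2 is a healthy spare.
actions = [
    Action("deploy_app_on_h2", fs("up(h2)"), fs("installed(app,h2)"), fs()),
    Action("start_app_on_h2", fs("installed(app,h2)"), fs("running(app,h2)"), fs()),
    Action("reroute_clients_to_h2", fs("running(app,h2)"), fs("routed(h2)"), fs("routed(h1)")),
    Action("start_cache_on_h2", fs("up(h2)"), fs("running(cache,h2)"), fs()),
]

# State captured after the failure: clients still point at the dead host.
state_after_failure = {"up(h2)", "routed(h1)"}

# Two acceptable goals: minimal functionality versus complete recovery.
minimal_goal = {"routed(h2)"}
full_goal = {"routed(h2)", "running(cache,h2)"}

minimal_plan = plan(state_after_failure, minimal_goal, actions)
full_plan = plan(state_after_failure, full_goal, actions)
print(minimal_plan)
print(full_plan)
```

The same search serves both goals: only the goal set changes, which mirrors the abstract's point that one mechanism can target anything from minimal functionality to complete recovery. Breadth-first search is used here only because it is short and easy to check; it does not scale to realistic state spaces.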