{"title":"并行程序中基于分类的容错支持方法","authors":"Gopinatha Jakadeesan, D. Goswami","doi":"10.1109/PDCAT.2009.47","DOIUrl":null,"url":null,"abstract":"Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is aimed to facilitate an application developer to incorporate appropriate fault-tolerance strategies to an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate the fault tolerance related details from an application developer’s main objectives (separation-of-concerns). The paper presents the additional key features of the framework, and concludes with a discussion on current and future research directions.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"A Classification-Based Approach to Fault-Tolerance Support in Parallel Programs\",\"authors\":\"Gopinatha Jakadeesan, D. Goswami\",\"doi\":\"10.1109/PDCAT.2009.47\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is aimed to facilitate an application developer to incorporate appropriate fault-tolerance strategies to an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate the fault tolerance related details from an application developer’s main objectives (separation-of-concerns). The paper presents the additional key features of the framework, and concludes with a discussion on current and future research directions.\",\"PeriodicalId\":312929,\"journal\":{\"name\":\"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies\",\"volume\":\"44 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PDCAT.2009.47\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDCAT.2009.47","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Classification-Based Approach to Fault-Tolerance Support in Parallel Programs
Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is aimed to facilitate an application developer to incorporate appropriate fault-tolerance strategies to an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate the fault tolerance related details from an application developer’s main objectives (separation-of-concerns). The paper presents the additional key features of the framework, and concludes with a discussion on current and future research directions.