Failure detection algorithms for a reliable execution of parallel programs

Proceedings. 14th Symposium on Reliable Distributed Systems. Pub Date: 1995-09-13. DOI: 10.1109/RELDIS.1995.526230

S. Chabridon, E. Gelenbe


Citations: 16

Abstract


Failure detection algorithms for a reliable execution of parallel programs

We report on the design and simulation of novel algorithms which will ensure that application software runs correctly on a MIMD system in which processing units (PU) can fail. The effect of these algorithms is evaluated for random task graphs using simulation as failure rates increase. An example of a specific application is also examined (the Fast Fourier Transform) for which we construct the task graph and then simulate its execution under various values of the failure rates of processors.
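The abstract mentions building a task graph for the Fast Fourier Transform and simulating its execution under processor failures. The sketch below is only an illustration of that idea, not the authors' algorithms, which the abstract does not describe. It assumes the standard radix-2 butterfly dependency pattern, a hypothetical per-attempt failure probability in place of the paper's failure rates, and a naive "retry the failed task" recovery policy. The names `fft_task_graph` and `simulate` are made up for this example.

```python
import random


def fft_task_graph(n):
    """Build the butterfly task graph of an n-point radix-2 FFT.

    Returns a dict mapping each task (stage, index) to the list of tasks
    it depends on. Stage 0 tasks are the inputs and have no predecessors.
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    stages = n.bit_length() - 1
    graph = {(0, i): [] for i in range(n)}
    for s in range(1, stages + 1):
        span = 1 << (s - 1)  # distance between butterfly partners at this stage
        for i in range(n):
            graph[(s, i)] = [(s - 1, i), (s - 1, i ^ span)]
    return graph


def simulate(graph, failure_prob, rng):
    """Count task attempts needed to finish every task in the graph.

    Assumption (not from the paper): each attempt fails independently with
    probability failure_prob < 1, and a failed task is simply retried.
    """
    attempts = 0
    done = set()
    for task in sorted(graph):  # sorting by (stage, index) is a topological order
        assert all(pred in done for pred in graph[task])
        while True:
            attempts += 1
            if rng.random() >= failure_prob:
                break  # this attempt succeeded
        done.add(task)
    return attempts


if __name__ == "__main__":
    graph = fft_task_graph(8)
    for p in (0.0, 0.1, 0.3):
        print(p, simulate(graph, p, random.Random(42)))
```

With a failure probability of zero the attempt count equals the number of tasks, and it can only grow as the probability rises, which mirrors the abstract's setup of evaluating execution as failure rates increase.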


Source journal

Proceedings. 14th Symposium on Reliable Distributed Systems

Self-citation rate

0.00%

Publication count