{"title":"一种低开销容错的多处理器调度算法","authors":"Koji Hashimoto, Tatsuhiro Tsuchiya, T. Kikuno","doi":"10.1109/RELDIS.1998.740493","DOIUrl":null,"url":null,"abstract":"We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. Applying the proposed algorithm to three kinds of practical task graphs (Gaussian elimination, Laplace equation solver and LU decomposition), we conduct simulations. Experimental results show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.","PeriodicalId":376253,"journal":{"name":"Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"A multiprocessor scheduling algorithm for low overhead fault-tolerance\",\"authors\":\"Koji Hashimoto, Tatsuhiro Tsuchiya, T. Kikuno\",\"doi\":\"10.1109/RELDIS.1998.740493\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. Applying the proposed algorithm to three kinds of practical task graphs (Gaussian elimination, Laplace equation solver and LU decomposition), we conduct simulations. Experimental results show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.\",\"PeriodicalId\":376253,\"journal\":{\"name\":\"Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281)\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1998-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RELDIS.1998.740493\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RELDIS.1998.740493","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A multiprocessor scheduling algorithm for low overhead fault-tolerance
We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. Applying the proposed algorithm to three kinds of practical task graphs (Gaussian elimination, Laplace equation solver and LU decomposition), we conduct simulations. Experimental results show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.