Fault tolerant heterogeneous scheduling for precedence constrained task graphs using simulated annealing
Authors: Hassan A. Youness, A. Omar, M. Moness
Published in: 2013 8th International Conference on Computer Engineering & Systems (ICCES), 2013-11-01
DOI: 10.1109/ICCES.2013.6707224 (https://doi.org/10.1109/ICCES.2013.6707224)
Citations: 2
Abstract
Scheduling is NP-complete in most of its forms, so no polynomial-time algorithm is known that finds an optimal schedule. Scheduling task graphs on heterogeneous architectures makes the problem harder still. Like any other platform, heterogeneous architectures are prone to faults, so fault-tolerance techniques are needed to ensure that the job completes; task replication is used here to achieve fault tolerance. Replication, however, increases scheduling complexity, lengthens the schedule considerably, and adds substantial communication delay overhead. We propose using simulated annealing to optimize the schedule according to platform reliability: the algorithm can minimize the lower-bound makespan on high-reliability platforms and optimize the upper-bound makespan on platforms that are prone to failures.
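The approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that every task is replicated on two distinct processors, that the lower-bound makespan takes the earliest replica finish (no failures) and the upper-bound makespan takes the latest replica finish (a pessimistic estimate), and that a neighbor move re-places the replicas of one randomly chosen task. The function names, the cooling parameters `t0` and `alpha`, and the four-task example graph are all hypothetical.

```python
import math
import random

def makespan(assign, order, preds, cost, comm, worst_case):
    """Schedule length of a replicated assignment (simplified, assumed model).

    assign[t] is a tuple of distinct processors holding replicas of task t.
    order is a topological order of the task graph.
    worst_case=False: earliest replica counts (lower bound, no failures).
    worst_case=True:  latest replica counts (pessimistic upper estimate).
    """
    pick = max if worst_case else min
    proc_free = {}   # processor -> time at which it becomes idle
    finish = {}      # (task, processor) -> finish time of that replica
    for t in order:
        for p in assign[t]:
            ready = 0.0
            for u in preds[t]:
                # Data from predecessor u; no communication delay on the same processor.
                arrival = pick(finish[(u, q)] + (0 if q == p else comm[(u, t)])
                               for q in assign[u])
                ready = max(ready, arrival)
            start = max(ready, proc_free.get(p, 0.0))
            finish[(t, p)] = start + cost[t][p]
            proc_free[p] = finish[(t, p)]
    return max(pick(finish[(t, p)] for p in assign[t]) for t in order)

def anneal(tasks, procs, preds, cost, comm, worst_case,
           replicas=2, t0=10.0, alpha=0.95, steps=2000, seed=0):
    """Simulated annealing over replica placements; tasks must be topologically ordered."""
    rng = random.Random(seed)
    current = {t: tuple(rng.sample(procs, replicas)) for t in tasks}
    cur_len = makespan(current, tasks, preds, cost, comm, worst_case)
    best, best_len = dict(current), cur_len
    temp = t0
    for _ in range(steps):
        cand = dict(current)
        t = rng.choice(tasks)
        cand[t] = tuple(rng.sample(procs, replicas))  # neighbor: re-place one task
        cand_len = makespan(cand, tasks, preds, cost, comm, worst_case)
        delta = cand_len - cur_len
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = dict(current), cur_len
        temp = max(temp * alpha, 1e-9)  # geometric cooling, clamped above zero
    return best, best_len

# Hypothetical example: a diamond-shaped task graph on three processors.
tasks = ["A", "B", "C", "D"]
procs = [0, 1, 2]
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
cost = {"A": [2, 3, 4], "B": [3, 2, 5], "C": [4, 4, 2], "D": [2, 3, 3]}  # cost[task][proc]
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 2}

lower_plan, lower_len = anneal(tasks, procs, preds, cost, comm, worst_case=False)
upper_plan, upper_len = anneal(tasks, procs, preds, cost, comm, worst_case=True)
print("lower-bound makespan:", lower_len)
print("upper-bound makespan:", upper_len)
```

Switching `worst_case` changes only the objective, which mirrors the abstract's idea of choosing the optimization target according to platform reliability. The cost model here is deliberately simple and ignores details such as link contention.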