A Task Scheduling Algorithm Based on Replication for Maximizing Reliability on Heterogeneous Computing Systems

Shuli Wang, Kenli Li, Jing Mei, Kuan-Ching Li, Yan Wang
Published in: 2014 IEEE International Parallel & Distributed Processing Symposium Workshops
Publication date: 2014-05-19
DOI: 10.1109/IPDPSW.2014.175 (https://doi.org/10.1109/IPDPSW.2014.175)
Citations: 2

Abstract

Over the past several years, heterogeneous computing (HC) systems have become more competitive as commercial computing platforms than homogeneous systems. As HC systems grow in scale, network failures become inevitable. To achieve high performance, communication reliability should be considered when designing reliability-aware task scheduling algorithms. In this paper, we propose a new algorithm called RMSR (Replication-based scheduling for Maximizing System Reliability), which incorporates task communication into system reliability. To maximize communication reliability, we propose an improved algorithm that searches for all optimal-reliability communication paths for the current tasks. During the task replication phase, the task reliability threshold is set by the user and each task receives a dynamic number of replicas. Our comparative studies based on randomly generated graphs show that RMSR outperforms existing scheduling algorithms in terms of system reliability. Several factors affecting the performance are also analyzed in the paper.
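The abstract names two ideas that a small example can make concrete: finding a communication path of optimal reliability, and adding replicas until a user-set reliability threshold is met. The following is a minimal Python sketch of both, not the RMSR algorithm itself, whose details the abstract does not give. It rests on assumptions that are not stated in the source: each link has an independent success probability in (0, 1], path reliability is the product of its link reliabilities, replicas of a task fail independently, and replicas are placed greedily on the most reliable processors first. The function names `most_reliable_path` and `choose_replicas`, the processor names, and all numbers are hypothetical. The path search is ordinary Dijkstra over `-log(r)` weights and returns a single best path rather than all optimal paths.

```python
import heapq
import math


def most_reliable_path(links, src, dst):
    """Return (reliability, path) of one most reliable route from src to dst.

    links maps a node to {neighbor: link_reliability}, reliabilities in (0, 1].
    Assumption: link failures are independent, so path reliability is a product.
    Maximizing that product equals minimizing the sum of -log(r), which lets
    standard Dijkstra do the search.
    """
    best = {src: 0.0}  # smallest known cost (sum of -log r) per node
    prev = {}          # predecessor of each node on its best known path
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry, a cheaper one was already handled
        for nxt, r in links.get(node, {}).items():
            new_cost = cost - math.log(r)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if dst not in best:
        return 0.0, []  # dst is unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[dst]), path


def choose_replicas(proc_reliability, threshold):
    """Pick processors for one task until its reliability meets the threshold.

    proc_reliability maps processor -> probability the task succeeds there.
    Assumption: replicas fail independently, so the task reliability is
    1 - prod(1 - r_i) over the chosen processors. If the threshold cannot be
    reached, every processor is used and the achieved value is returned.
    """
    chosen = []
    fail_all = 1.0  # probability that every replica chosen so far fails
    ranked = sorted(proc_reliability.items(), key=lambda kv: kv[1], reverse=True)
    for proc, r in ranked:
        if 1.0 - fail_all >= threshold:
            break
        chosen.append(proc)
        fail_all *= 1.0 - r
    return chosen, 1.0 - fail_all


if __name__ == "__main__":
    # Hypothetical three-processor network; the direct P1-P2 link is the weak one.
    links = {"P1": {"P2": 0.90, "P3": 0.99}, "P3": {"P2": 0.98}}
    print(most_reliable_path(links, "P1", "P2"))
    # Hypothetical per-processor reliabilities for one task, threshold 0.99.
    print(choose_replicas({"P1": 0.95, "P2": 0.90, "P3": 0.80}, 0.99))
```

In this toy network the two-hop route through P3 is preferred over the direct link because 0.99 × 0.98 exceeds 0.90, and the task needs two replicas because one processor alone stays below the 0.99 threshold. This is one way the replica count can vary per task once the threshold is fixed by the user.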