Efficient replication of queued tasks for latency reduction in cloud systems

Gauri Joshi, E. Soljanin, G. Wornell
2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton). Published 2015-10-15. DOI: 10.1109/ALLERTON.2015.7446992 (https://doi.org/10.1109/ALLERTON.2015.7446992)
Citations: 45

Abstract

In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective way to combat the variability in the response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, which increases the waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, for example the number of replicas and the times at which they are issued and canceled, affect latency and computing cost. We find that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service time distribution is log-convex, then adding maximum redundancy reduces both latency and cost. If it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.
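
The trade-off described in the abstract can be illustrated with a small Python sketch. It is not the paper's model: it ignores queueing entirely and assumes that all r replicas start at the same time, that their service times are independent, and that the remaining replicas are canceled the moment the first one finishes. Under those assumptions the service time is the minimum of r draws, and the computing cost is r times that minimum. The function names and the three example distributions (exponential, shifted exponential as a log-concave case, hyperexponential as a log-convex case) are choices made here for illustration.

```python
from math import comb


def exp_min_mean(r, mu):
    """E[min of r i.i.d. Exp(mu)] = 1 / (r * mu)."""
    return 1.0 / (r * mu)


def shifted_exp_min_mean(r, delta, mu):
    """E[min of r i.i.d. (delta + Exp(mu))]: the constant shift is not reduced."""
    return delta + 1.0 / (r * mu)


def hyperexp_min_mean(r, p, mu1, mu2):
    """E[min of r i.i.d. draws that are Exp(mu1) w.p. p and Exp(mu2) otherwise.

    Integrates the r-th power of the tail p*exp(-mu1*x) + (1-p)*exp(-mu2*x)
    term by term via the binomial expansion.
    """
    return sum(
        comb(r, k) * p**k * (1 - p) ** (r - k) / (k * mu1 + (r - k) * mu2)
        for k in range(r + 1)
    )


def computing_cost(r, mean_min):
    """Expected total server time: r replicas each run until the first finishes."""
    return r * mean_min


if __name__ == "__main__":
    for r in (1, 2, 4):
        e = exp_min_mean(r, mu=1.0)
        s = shifted_exp_min_mean(r, delta=1.0, mu=1.0)
        h = hyperexp_min_mean(r, p=0.5, mu1=1.0, mu2=10.0)
        print(
            f"r={r}  exp: T={e:.3f} C={computing_cost(r, e):.3f}  "
            f"shifted: T={s:.3f} C={computing_cost(r, s):.3f}  "
            f"hyper: T={h:.3f} C={computing_cost(r, h):.3f}"
        )
```

In this simplified setting the service time falls with r in all three cases, but the cost behaves differently: it stays flat for the exponential, grows for the shifted exponential, and shrinks for the hyperexponential. That matches the direction of the abstract's claim, although the waiting time in queue, which the paper also analyzes, is not modeled here.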