Efficient replication of queued tasks for latency reduction in cloud systems

2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton). Pub Date: 2015-10-15. DOI: 10.1109/ALLERTON.2015.7446992

Gauri Joshi, E. Soljanin, G. Wornell

Citations: 45

Abstract

In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, which increases the waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, for example the number of replicas and the times at which they are issued and canceled, affect latency and computing cost. Our analysis shows that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service time distribution is log-convex, then adding maximum redundancy reduces both latency and cost. If it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.
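
The contrast between log-convex and log-concave service times can be illustrated with a small calculation. The sketch below is not the paper's model or analysis. It assumes a job is replicated on r servers at once and all remaining copies are canceled when the first finishes, so latency is the minimum of r i.i.d. service times and computing cost is r times that minimum. It ignores queueing delay entirely. The two distributions (a shifted exponential as a log-concave example, a two-phase hyperexponential as a log-convex example), their parameters, and all function names are assumptions chosen for illustration.

```python
from math import comb


def shifted_exp_min_mean(r, delta, mu):
    """Mean of the minimum of r i.i.d. (delta + Exp(mu)) service times.

    The constant delta is paid by every replica; the minimum of r
    exponentials with rate mu is exponential with rate r * mu.
    """
    return delta + 1.0 / (r * mu)


def hyperexp_min_mean(r, p, mu1, mu2):
    """Mean of the minimum of r i.i.d. hyperexponential service times.

    Each service time is Exp(mu1) with probability p, else Exp(mu2).
    The tail of the minimum is (p*e^{-mu1 x} + (1-p)*e^{-mu2 x})^r;
    expanding with the binomial theorem and integrating term by term
    gives the exact sum below.
    """
    return sum(
        comb(r, k) * p**k * (1 - p) ** (r - k) / (k * mu1 + (r - k) * mu2)
        for k in range(r + 1)
    )


def latency_and_cost(min_mean, r):
    """Latency is E[min]; cost is r * E[min] because all r servers
    stay busy until the first copy finishes (assumed cancellation rule)."""
    return min_mean, r * min_mean


def sweep(max_r=4):
    """Return (r, (latency, cost) log-concave, (latency, cost) log-convex)."""
    rows = []
    for r in range(1, max_r + 1):
        concave = latency_and_cost(shifted_exp_min_mean(r, delta=1.0, mu=1.0), r)
        convex = latency_and_cost(hyperexp_min_mean(r, p=0.9, mu1=10.0, mu2=0.5), r)
        rows.append((r, concave, convex))
    return rows


if __name__ == "__main__":
    for r, concave, convex in sweep():
        print(f"r={r}  log-concave: latency={concave[0]:.3f} cost={concave[1]:.3f}"
              f"  log-convex: latency={convex[0]:.3f} cost={convex[1]:.3f}")
```

Under these assumed parameters, latency falls with r in both cases, but cost rises with r for the shifted exponential and falls with r for the hyperexponential, which matches the direction of the abstract's claim in this simplified no-queue setting.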
