An elegant sufficiency: load-aware differentiated scheduling of data transfers

SC15: International Conference for High Performance Computing, Networking, Storage and Analysis

Pub Date: 2015-11-15

DOI: 10.1145/2807591.2807660

R. Kettimuthu, Gayane Vardoyan, G. Agrawal, P. Sadayappan, Ian T Foster


Citations: 15

Abstract



We investigate the file transfer scheduling problem, where transfers among different endpoints must be scheduled to maximize pertinent metrics. We propose two new algorithms that exploit the fact that the aggregate bandwidth obtained over a network or at a storage system tends to increase with the number of concurrent transfers, but only up to a certain limit. The first algorithm, SEAL, uses runtime information and data-driven models to approximate system load and adapt transfer schedules and concurrency so as to maximize performance while avoiding saturation. We implement this algorithm using GridFTP as the transfer protocol and evaluate it using real transfer logs in a production WAN environment. Results show that SEAL can improve average slowdowns and turnaround times by up to 25% and worst-case slowdown and turnaround times by up to 50%, compared with the best-performing baseline scheme. Our second algorithm, STEAL, further leverages user-supplied categorization of transfers as either "interactive" (requiring immediate processing) or "batch" (less time-critical). Results show that STEAL reduces the average slowdown of interactive transfers by 63% compared with the best-performing baseline and by 21% compared with SEAL. For batch transfers, STEAL improves the utilization of the bandwidth left unused by interactive transfers by 18% compared with the best-performing baseline. By elegantly ensuring a sufficient, but not excessive, allocation of concurrency to the right transfers, we significantly improve overall performance despite constraints.
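The two ideas the abstract relies on (aggregate bandwidth that stops growing past some level of concurrency, and preferential treatment of "interactive" transfers) can be illustrated with a toy sketch. This is not SEAL or STEAL, whose details the abstract does not give. The linear-then-flat bandwidth model, the `Transfer` class, and the `allocate` function with its parameters are all assumptions made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    interactive: bool  # user-supplied category, as described in the abstract
    streams: int = 0   # concurrency granted by this toy allocator


def aggregate_bandwidth(total_streams: int, per_stream: float, capacity: float) -> float:
    """Hypothetical model: bandwidth grows linearly with concurrency, then flattens."""
    return min(total_streams * per_stream, capacity)


def allocate(transfers, per_stream, capacity, max_streams_each):
    """Grant one stream at a time, interactive transfers first, and stop as
    soon as one more stream would no longer raise aggregate bandwidth."""
    # Stable sort: interactive transfers (key False) come before batch ones.
    order = sorted(transfers, key=lambda t: not t.interactive)
    total = 0
    for _ in range(max_streams_each):
        for t in order:
            before = aggregate_bandwidth(total, per_stream, capacity)
            after = aggregate_bandwidth(total + 1, per_stream, capacity)
            if after <= before:
                return transfers  # saturated: extra concurrency adds nothing
            t.streams += 1
            total += 1
    return transfers


if __name__ == "__main__":
    jobs = [Transfer("A", True), Transfer("B", False), Transfer("C", True)]
    for job in allocate(jobs, per_stream=2.0, capacity=10.0, max_streams_each=4):
        print(job.name, job.streams)
```

With a per-stream rate of 2.0 and a capacity of 10.0, the model saturates at five streams, so the allocator stops there and the batch transfer ends up with fewer streams than the interactive ones. A real scheduler would have to estimate the saturation point from observed load, which is the role the abstract assigns to runtime information and data-driven models.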
