Improving Data Transfer Throughput with Direct Search Optimization

2016 45th International Conference on Parallel Processing (ICPP). Pub Date: 2016-08-01. DOI: 10.1109/ICPP.2016.36

Prasanna Balaprakash, V. Morozov, R. Kettimuthu, Kalyan Kumaran, Ian T Foster


Citations: 8

Abstract

Improving data transfer throughput over high-speed long-distance networks has become increasingly difficult. Numerous factors, such as nondeterministic congestion, dynamics of the transfer protocol, and multiuser and multitask source and destination endpoints, as well as interactions among these factors, contribute to this difficulty. A promising approach to improving throughput is to use parallel streams at the application layer. We formulate and solve the problem of choosing the number of such streams from a mathematical optimization perspective. We propose the use of direct search methods, a class of easy-to-implement and lightweight mathematical optimization algorithms, to improve the performance of data transfers by dynamically adapting the number of parallel streams in a manner that does not require domain expertise, instrumentation, analytical models, or historical data. We apply our method to transfers performed with the GridFTP protocol and illustrate the effectiveness of the proposed algorithm when used within Globus, a state-of-the-art data transfer tool, on production WAN links and servers. We show that, when compared to user default settings, our direct search methods can achieve up to 10x performance improvement under certain conditions. We also show that our method can overcome performance degradation due to external compute and network load on source endpoints, a common scenario at high-performance computing facilities.
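To make the idea of a direct search over the number of parallel streams concrete, here is a minimal sketch of a one-dimensional compass (pattern) search, one simple member of the direct search family. It is not the authors' algorithm, which the abstract does not specify. The function names, the bounds, the step schedule, and the `synthetic_throughput` model are all assumptions made for illustration. In a real transfer the measurement would be a noisy, time-varying throughput sample from the transfer tool rather than a fixed formula.

```python
from typing import Callable, Dict, Tuple


def compass_search_streams(
    measure_throughput: Callable[[int], float],
    start: int = 2,
    lower: int = 1,
    upper: int = 64,
    initial_step: int = 8,
    max_evals: int = 50,
) -> Tuple[int, float]:
    """Derivative-free 1-D compass search over an integer stream count.

    Polls one step above and below the current best point, moves to the
    first neighbor that improves throughput, and halves the step when
    neither neighbor improves. All parameters are illustrative assumptions.
    """
    cache: Dict[int, float] = {}

    def evaluate(n: int) -> float:
        # Each distinct stream count is measured once; repeats hit the cache.
        if n not in cache:
            cache[n] = measure_throughput(n)
        return cache[n]

    best = min(max(start, lower), upper)
    best_value = evaluate(best)
    step = initial_step

    while step >= 1 and len(cache) < max_evals:
        improved = False
        for candidate in (best + step, best - step):
            candidate = min(max(candidate, lower), upper)  # stay within bounds
            if candidate == best:
                continue
            value = evaluate(candidate)
            if value > best_value:
                best, best_value = candidate, value
                improved = True
                break
        if not improved:
            step //= 2  # contract the pattern around the current best point

    return best, best_value


def synthetic_throughput(n: int) -> float:
    """Hypothetical, noise-free stand-in for a measured transfer rate.

    Throughput rises with more streams, then falls as contention grows;
    the integer maximum is at n = 20.
    """
    return 100.0 * n / (1.0 + (n / 20.0) ** 2)


if __name__ == "__main__":
    streams, rate = compass_search_streams(synthetic_throughput)
    print(f"best stream count: {streams}, modeled throughput: {rate:.1f}")
```

The search needs only throughput observations, which matches the abstract's point that no analytical model, instrumentation, or historical data is required. A production version would also have to cope with measurement noise and with an optimum that drifts as external load changes, which this sketch does not attempt.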

