Comparative Performance Evaluation of High-performance Data Transfer Tools

Deepak Nadig, Eun-Sung Jung, R. Kettimuthu, Ian T Foster, N. Rao, B. Ramamurthy
Published in: 2018 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), 2018-12-01
DOI: 10.1109/ANTS.2018.8710071 (https://doi.org/10.1109/ANTS.2018.8710071)
Citations: 1

Abstract

Data transfer in wide-area networks has long been studied in different contexts, from data sharing among data centers to online access to scientific data. Many software tools and platforms, such as GridFTP, FDT, bbcp, mdtmFTP, and XDD, have been developed to make data transfer over wide-area networks easy, reliable, fast, and secure. However, few studies have used careful comparative evaluation to show the full capabilities of existing data transfer tools, in particular whether they have fully adopted state-of-the-art techniques. In this paper, we evaluate the performance of four high-performance data transfer tools (GridFTP, FDT, mdtmFTP, and XDD) in various environments. Our evaluation suggests that each tool has strengths and weaknesses. FDT and GridFTP perform consistently in diverse environments. XDD and mdtmFTP show improved performance in a limited set of environments and datasets in our evaluation. Unlike other studies of data transfer tools, we also evaluate the predictability of each tool's performance, an important factor in scheduling the different stages of science workflows. Performance predictability also helps in (auto)tuning the configurable parameters of a data transfer tool. We apply statistical learning techniques, namely linear/polynomial regression and k-nearest neighbors (kNN), to assess how well each tool's performance can be predicted from its control parameters. Our results show that we can achieve good prediction performance for GridFTP using linear regression and for mdtmFTP using kNN.
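The two prediction techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the control parameters (parallel streams, concurrent files), the throughput values, and the function names `knn_predict` and `fit_line` are all assumptions made purely for illustration.

```python
import math


def knn_predict(samples, query, k=3):
    """Predict a target for `query` as the mean target of its k nearest samples.

    samples: list of (features, target) pairs; features are equal-length tuples.
    Distance is plain Euclidean distance over the raw (unscaled) features.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    ranked = sorted(samples, key=lambda s: math.dist(s[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / k


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x for a single control parameter."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical, made-up measurements (NOT from the paper):
# (parallel streams, concurrent files) -> throughput in Gbps
SAMPLES = [
    ((1, 1), 1.0),
    ((2, 1), 2.0),
    ((4, 1), 4.0),
    ((4, 2), 5.0),
    ((8, 2), 7.0),
    ((8, 4), 8.0),
]

if __name__ == "__main__":
    # kNN estimate for an unseen parameter setting
    print(knn_predict(SAMPLES, (6, 2), k=2))
    # Linear fit of throughput against the first parameter only
    streams = [features[0] for features, _ in SAMPLES]
    gbps = [target for _, target in SAMPLES]
    print(fit_line(streams, gbps))
```

In such a setup, predictability would be judged by how closely predictions for held-out parameter settings match measured throughput; the abstract does not state which error metric the authors used.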