On causes of GridFTP transfer throughput variance

Zhengyang Liu, M. Veeraraghavan, Jianhui Zhou, Jason Hick, Yee-Ting Li
Network-aware Data Management, published 2013-11-17
DOI: 10.1145/2534695.2534701 (https://doi.org/10.1145/2534695.2534701)
Citations: 4

Abstract

In prior work, we analyzed the GridFTP usage logs collected by data transfer nodes (DTNs) located at national scientific computing centers, and found significant throughput variance even among transfers between the same two end hosts. The goal of this work is to quantify the impact of various factors on throughput variance. Our methodology consisted of executing experiments on a high-speed research testbed, running large-sized instrumented transfers between operational DTNs, and creating statistical models from collected measurements. A non-linear regression model for memory-to-memory transfer throughput as a function of CPU usage at the two DTNs and packet loss rate was created. The model is useful for determining concomitant resource allocations to use in scheduling requests. For example, if a whole NERSC DTN CPU core can be assigned to the GridFTP process executing a large memory-to-memory transfer to SLAC, then only 32% of a CPU core is required at the SLAC DTN for the corresponding GridFTP process due to a difference in the computing speeds of these two DTNs. With these CPU allocations, data can be moved at 6.3 Gbps, which sets the rate to request from the circuit scheduler.
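
The closing example in the abstract can be turned into a small calculation. The sketch below is illustrative only: the abstract gives neither the form nor the coefficients of the non-linear regression model, so the code assumes throughput scales linearly with each host's CPU share. That linear scaling, the function name matched_allocation, and the constant names are assumptions made here, not part of the paper. Only the two figures (6.3 Gbps and 32%) come from the abstract.

```python
from typing import Tuple

# Figures quoted in the abstract for a NERSC -> SLAC memory-to-memory transfer.
FULL_RATE_GBPS = 6.3        # rate when a whole NERSC DTN core is assigned
SLAC_SHARE_AT_FULL = 0.32   # fraction of a SLAC DTN core needed at that rate


def matched_allocation(nersc_share: float) -> Tuple[float, float]:
    """Return (slac_share, rate_gbps) for a NERSC core share in (0, 1].

    ASSUMPTION: throughput and the matching SLAC share both scale linearly
    with the NERSC share. The paper's actual model is non-linear and also
    depends on packet loss rate, which this sketch ignores.
    """
    if not 0.0 < nersc_share <= 1.0:
        raise ValueError("nersc_share must be in (0, 1]")
    slac_share = SLAC_SHARE_AT_FULL * nersc_share
    rate_gbps = FULL_RATE_GBPS * nersc_share
    return slac_share, rate_gbps


if __name__ == "__main__":
    for share in (1.0, 0.5):
        slac, rate = matched_allocation(share)
        print(f"NERSC share {share:.2f}: SLAC share {slac:.2f}, request {rate:.2f} Gbps")
```

With a full NERSC core the function reproduces the abstract's figures; for smaller shares the output is only as good as the linear assumption, so the rate requested from the circuit scheduler should come from the fitted model where it is available.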