Predicting sporadic grid data transfers

Sudharshan S. Vazhkudai, J. Schopf
Venue: Proceedings 11th IEEE International Symposium on High Performance Distributed Computing
Published: 2002-07-24
DOI: 10.1109/HPDC.2002.1029918 (https://doi.org/10.1109/HPDC.2002.1029918)
Citations: 88

Abstract

The increasingly common practice of replicating datasets and using resources as distributed data stores in grid environments has led to the problem of determining which replica can be accessed most efficiently. Because of the diverse performance characteristics and load variations of several components in the end-to-end path linking these various locations, selecting a replica from among many requires accurate predictions of the data transfer times between the sources and sinks. In this paper we present a prediction system that combines end-to-end application throughput observations with network load variations, capturing whole-system performance and variations in load patterns, respectively. We develop a set of regression models to derive predictions that characterize the effect of network load variations on file transfer times. We apply these techniques to the GridFTP data movement tool, part of the Globus Toolkit™, and observe gains of up to 10% in prediction accuracy when compared with approaches based on past system behavior in isolation.
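The idea of regressing observed transfer throughput on a network load measurement can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not specify the form of the regression, so a simple one-variable least-squares fit is assumed here. All names (`fit_linear`, `predict_transfer_time`, `pick_replica`), the units, and the sample numbers are hypothetical.

```python
from statistics import mean


def fit_linear(loads, throughputs):
    """Least-squares fit of throughput = a + b * load; returns (a, b).

    `loads` are network measurements (assumed here: probe bandwidth in Mb/s)
    taken alongside past transfers; `throughputs` are the observed
    end-to-end transfer rates (assumed: MB/s).
    """
    if len(loads) != len(throughputs) or len(loads) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(loads), mean(throughputs)
    sxx = sum((x - x_bar) ** 2 for x in loads)
    if sxx == 0:
        raise ValueError("load values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(loads, throughputs))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict_transfer_time(file_size_mb, current_load, model):
    """Predicted transfer time in seconds for one source, given its model."""
    a, b = model
    throughput = a + b * current_load
    if throughput <= 0:
        raise ValueError("model predicts non-positive throughput")
    return file_size_mb / throughput


def pick_replica(file_size_mb, current_loads, models):
    """Return (replica_name, predicted_seconds) with the smallest prediction."""
    times = {
        name: predict_transfer_time(file_size_mb, current_loads[name], model)
        for name, model in models.items()
    }
    best = min(times, key=times.get)
    return best, times[best]


if __name__ == "__main__":
    # Hypothetical history for one replica site: (probe Mb/s, observed MB/s).
    history_loads = [10, 20, 30, 40]
    history_rates = [2.0, 4.0, 6.0, 8.0]
    model = fit_linear(history_loads, history_rates)
    print(predict_transfer_time(100, 25, model))
```

In this sketch, replica selection reduces to fitting one model per candidate source from its transfer history and choosing the source with the smallest predicted time for the current network measurement.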