Network-aware Data Management: Latest Publications

Network-aware data caching and prefetching for cloud-hosted metadata retrieval
Network-aware Data Management Pub Date: 2013-11-17 DOI: 10.1145/2534695.2534700
Bing Zhang, Brandon Ross, Sanatkumar Tripathi, Sonali Batra, T. Kosar
{"title":"Network-aware data caching and prefetching for cloud-hosted metadata retrieval","authors":"Bing Zhang, Brandon Ross, Sanatkumar Tripathi, Sonali Batra, T. Kosar","doi":"10.1145/2534695.2534700","DOIUrl":"https://doi.org/10.1145/2534695.2534700","url":null,"abstract":"With the overwhelming emergence of data-intensive applications in the Cloud, the wide-area transfer of metadata and other descriptive information about remote data is critically important for searching, indexing, and enumerating remote file system hierarchies, as well as for purposes of data transfer estimation and reservation. In this paper, we present a highly efficient network-aware caching and prefetching mechanism tailored to reduce metadata access latency and improve responsiveness in wide-area data transfers. To improve the maximum requests per second (RPS) handled by the system, we designed and implemented a network-aware prefetching service using dynamically provisioned parallel TCP streams. To improve the performance of accessing local metadata, we designed and implemented a non-blocking concurrent in-memory cache to handle unexpected bursts of requests. We have implemented the proposed mechanisms in the Directory Listing Service (DLS) system---a Cloud-hosted metadata retrieval, caching, and prefetching system, and have evaluated its performance on Amazon EC2 and XSEDE.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126401396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
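As a rough illustration of the two mechanisms this abstract names, the Python sketch below pairs a small thread-safe in-memory cache with a prefetcher that warms it over parallel TCP connections. It is a minimal sketch, not code from the DLS system: every name in it (MetadataCache, fetch_listing, prefetch) is hypothetical, the stream count is fixed rather than dynamically provisioned, and where the paper's cache is non-blocking, this one uses a coarse lock for brevity.

```python
# Hypothetical sketch of network-aware prefetching plus an in-memory cache;
# not the DLS implementation described in the paper above.
import threading
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

class MetadataCache:
    """A small thread-safe in-memory cache for directory listings."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:  # coarse lock; the paper's cache is non-blocking
            return self._data.get(key)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

def fetch_listing(url):
    """Fetch one remote directory listing over its own TCP connection."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()

def prefetch(cache, urls, streams=4):
    """Warm the cache by fetching likely-next listings on parallel streams."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        for url, body in zip(urls, pool.map(fetch_listing, urls)):
            cache.put(url, body)
```

A caller would prefetch the subdirectories of a listing the moment the parent is requested, so follow-up requests hit the cache instead of the wide-area network.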
The practical obstacles of data transfer: why researchers still love scp
Network-aware Data Management Pub Date: 2013-11-17 DOI: 10.1145/2534695.2534703
H. Nam, Jason Hill, S. Parete-Koon
{"title":"The practical obstacles of data transfer: why researchers still love scp","authors":"H. Nam, Jason Hill, S. Parete-Koon","doi":"10.1145/2534695.2534703","DOIUrl":"https://doi.org/10.1145/2534695.2534703","url":null,"abstract":"The importance of computing facilities is heralded every six months with the announcement of the new Top500 list, showcasing the world's fastest supercomputers. Unfortunately, with great computing capability does not come great long-term data storage capacity, which often means users must move their data to their local site archive, to remote sites where they may be doing future computation or analysis, or back to their home institution, else face the dreaded data purge that most HPC centers employ to keep utilization of large parallel filesystems low to manage performance and capacity. At HPC centers, data transfer is crucial to the scientific workflow and will increase in importance as computing systems grow in size. The Energy Sciences Network (ESnet) recently launched its fifth generation network, a 100 Gbps high-performance, unclassified national network connecting more than 40 DOE research sites to support scientific research and collaboration. Despite the tenfold increase in bandwidth to DOE research sites amenable to multiple data transfer streams and high throughput, in practice, researchers often under-utilize the network and resort to painfully-slow single stream transfer methods such as scp to avoid the complexity of using multiple stream tools such as GridFTP and bbcp, and contend with frustration from the lack of consistency of available tools between sites. In this study we survey and assess the data transfer methods provided at several DOE supported computing facilities, including both leadership-computing facilities, connected through ESnet. We present observed transfer rates, suggested optimizations, and discuss the obstacles the tools must overcome to receive wide-spread adoption over scp.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132544427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
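A back-of-the-envelope calculation shows why the multi-stream tools surveyed here can outrun single-stream scp on a fast, high-latency path: steady-state TCP throughput is roughly bounded by window size over round-trip time, so aggregate throughput scales with the number of streams until the link or disks saturate. The window, RTT, and stream count below are illustrative assumptions, not measurements from this study.

```python
# Why parallel streams help on long fat networks: each TCP stream is capped
# near window/RTT, far below a 100 Gbps link. All numbers are assumptions.
def stream_throughput_gbps(window_bytes: int, rtt_s: float) -> float:
    """Steady-state TCP throughput bound: window size / round-trip time."""
    return window_bytes * 8 / rtt_s / 1e9

window = 4 * 2**20   # assume a 4 MiB TCP window
rtt = 0.05           # assume a 50 ms cross-country round-trip time

single = stream_throughput_gbps(window, rtt)  # one scp-style stream
parallel = 8 * single                         # eight GridFTP/bbcp-style streams
print(f"single stream: {single:.2f} Gbps, 8 streams: {parallel:.2f} Gbps")
# -> single stream: 0.67 Gbps, 8 streams: 5.37 Gbps
```

Under these assumptions a single stream leaves more than 99% of a 100 Gbps path idle, which is consistent with the under-utilization the authors observe.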
On causes of GridFTP transfer throughput variance
Network-aware Data Management Pub Date: 2013-11-17 DOI: 10.1145/2534695.2534701
Zhengyang Liu, M. Veeraraghavan, Jianhui Zhou, Jason Hick, Yee-Ting Li
{"title":"On causes of GridFTP transfer throughput variance","authors":"Zhengyang Liu, M. Veeraraghavan, Jianhui Zhou, Jason Hick, Yee-Ting Li","doi":"10.1145/2534695.2534701","DOIUrl":"https://doi.org/10.1145/2534695.2534701","url":null,"abstract":"In prior work, we analyzed the GridFTP usage logs collected by data transfer nodes (DTNs) located at national scientific computing centers, and found significant throughput variance even among transfers between the same two end hosts. The goal of this work is to quantify the impact of various factors on throughput variance. Our methodology consisted of executing experiments on a high-speed research testbed, running large-sized instrumented transfers between operational DTNs, and creating statistical models from collected measurements. A non-linear regression model for memory-to-memory transfer throughput as a function of CPU usage at the two DTNs and packet loss rate was created. The model is useful for determining concomitant resource allocations to use in scheduling requests. For example, if a whole NERSC DTN CPU core can be assigned to the GridFTP process executing a large memory-to-memory transfer to SLAC, then only 32% of a CPU core is required at the SLAC DTN for the corresponding GridFTP process due to a difference in the computing speeds of these two DTNs. With these CPU allocations, data can be moved at 6.3 Gbps, which sets the rate to request from the circuit scheduler.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127715178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
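The abstract does not give the regression model's functional form, so the sketch below is a hypothetical stand-in that captures two of its stated ingredients: throughput bounded by the slower of the two DTNs' CPU shares (the factor of roughly 3 between endpoints echoes the NERSC/SLAC example above), and degradation with packet loss via a Mathis-style inverse square root. The data are synthetic, generated from the model itself, so scipy's curve_fit can recover the parameters.

```python
# Hypothetical non-linear regression for transfer throughput; the paper's
# actual model form and measurements are not reproduced here.
import numpy as np
from scipy.optimize import curve_fit

def model(X, a, b):
    """Throughput ~ a * min(src CPU, b * dst CPU) / sqrt(loss).
    b converts destination CPU share into source-equivalent units."""
    cpu_src, cpu_dst, loss = X
    return a * np.minimum(cpu_src, b * cpu_dst) / np.sqrt(loss)

# Synthetic "measurements" generated from the model plus noise.
rng = np.random.default_rng(0)
cpu_src = rng.uniform(0.2, 1.0, 50)   # source DTN CPU share per transfer
cpu_dst = rng.uniform(0.1, 0.5, 50)   # destination DTN CPU share
loss = rng.uniform(1e-5, 1e-3, 50)    # packet loss rate
gbps = model((cpu_src, cpu_dst, loss), 0.02, 3.0) + rng.normal(0, 0.05, 50)

params, _ = curve_fit(model, (cpu_src, cpu_dst, loss), gbps, p0=(0.01, 1.0))
print("fitted (a, b):", params)   # should land near (0.02, 3.0)
```

A scheduler could invert such a model to pick the cheapest CPU allocations that still hit a target rate, which is how the paper uses its model to set the 6.3 Gbps circuit request.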
Evaluating I/O aware network management for scientific workflows on networked clouds
Network-aware Data Management Pub Date: 2013-11-17 DOI: 10.1145/2534695.2534698
A. Mandal, P. Ruth, I. Baldin, Yufeng Xin, C. Castillo, M. Rynge, E. Deelman
{"title":"Evaluating I/O aware network management for scientific workflows on networked clouds","authors":"A. Mandal, P. Ruth, I. Baldin, Yufeng Xin, C. Castillo, M. Rynge, E. Deelman","doi":"10.1145/2534695.2534698","DOIUrl":"https://doi.org/10.1145/2534695.2534698","url":null,"abstract":"This paper presents a performance evaluation of scientific workflows on networked cloud systems with particular emphasis on evaluating the effect of provisioned network bandwidth on application I/O performance. The experiments were run on ExoGENI, a widely distributed networked infrastructure as a service (NIaaS) testbed. ExoGENI orchestrates a federation of independent cloud sites located around the world along with backbone circuit providers. The evaluation used a representative data-intensive scientific workflow application called Montage. The application was deployed on a virtualized HTCondor environment provisioned dynamically from the ExoGENI networked cloud testbed, and managed by the Pegasus workflow manager.\u0000 The results of our experiments show the effect of modifying provisioned network bandwidth on disk I/O throughput and workflow execution time. The marginal benefit as perceived by the workflow reduces as the network bandwidth allocation increases to a point where disk I/O saturates. There is little or no benefit from increasing network bandwidth beyond this inflection point. The results also underline the importance of network and I/O performance isolation for predictable application performance, and are applicable for general data-intensive workloads. Insights from this work will also be useful for real-time monitoring, application steering and infrastructure planning for data-intensive workloads on networked cloud platforms.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126067750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
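The inflection point described here can be captured in a one-line model: effective workflow I/O rate is the minimum of the provisioned network bandwidth and the disk I/O ceiling. The sketch below assumes a 400 MB/s disk ceiling purely for illustration; the paper's actual saturation point depends on the provisioned VMs and storage.

```python
# Toy model of the bandwidth/disk-I/O inflection point; the 400 MB/s disk
# ceiling is an assumed figure, not a measurement from the Montage runs.
DISK_CEILING_MBPS = 400 * 8  # assumed aggregate disk throughput, in Mb/s

def effective_rate_mbps(provisioned_mbps: int) -> int:
    """Workflow I/O is bounded by the slower of network and disk."""
    return min(provisioned_mbps, DISK_CEILING_MBPS)

for bw in (500, 1000, 2000, 3200, 5000, 10000):
    print(f"provisioned {bw:>5} Mb/s -> effective {effective_rate_mbps(bw):>4} Mb/s")
```

Past 3200 Mb/s in this toy model, every extra megabit of provisioned bandwidth is wasted, matching the paper's observation that there is little or no benefit beyond the point where disk I/O saturates.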