DataMover: robust terabyte-scale multi-file replication over wide-area networks

A. Sim, Junmin Gu, A. Shoshani, V. Natarajan
Published in: Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004.
Publication date: 2004-04-05
DOI: 10.1109/SSDBM.2004.28 (https://doi.org/10.1109/SSDBM.2004.28)
Citation count: 24

Abstract

Large scientific datasets (on the order of terabytes) are typically generated at large computational centers and stored on mass storage systems. However, large subsets of the data need to be moved to facilities where application scientists can analyze them. Replicating thousands of files is a tedious and error-prone, yet extremely important, task in scientific applications. Automating it requires automatic space acquisition and reuse; monitoring the progress of staging thousands of files from the source mass storage system, transferring them over the network, and archiving them at the target mass storage system or disk system; and recovering from transient system failures. We have developed a robust replication system, called DataMover, which is now in regular use in high-energy physics and climate modeling experiments. A single command is all that is needed to request multi-file replication or the replication of an entire directory. A Web-based tool was developed to dynamically monitor the progress of the multi-file replication process.
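
The per-file flow the abstract describes (stage at the source, transfer over the network, archive at the target, retry on transient failures, and report progress) can be illustrated with a minimal Python sketch. This is not DataMover's implementation: the class `Replicator`, the exception `TransientError`, the stage names, and the `max_attempts` parameter are all hypothetical names chosen for illustration, and space acquisition and reuse are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# (action key, status recorded once that action succeeds), in pipeline order.
# These names are illustrative assumptions, not taken from the paper.
STAGES = (
    ("stage", "staged"),
    ("transfer", "transferred"),
    ("archive", "archived"),
)


class TransientError(Exception):
    """Hypothetical marker for a failure that is worth retrying."""


@dataclass
class Replicator:
    # One callable per action key; each takes a file name and raises
    # TransientError on a recoverable failure.
    actions: Dict[str, Callable[[str], None]]
    max_attempts: int = 3
    status: Dict[str, str] = field(default_factory=dict)

    def replicate(self, files: List[str]) -> Dict[str, str]:
        """Push every file through all stages, recording its last state."""
        for name in files:
            self.status[name] = "pending"
            for action, done_state in STAGES:
                if not self._run_with_retry(action, name):
                    self.status[name] = f"failed:{action}"
                    break  # give up on this file, continue with the next
                self.status[name] = done_state
        return self.status

    def _run_with_retry(self, action: str, name: str) -> bool:
        """Retry an action up to max_attempts times on TransientError."""
        for _ in range(self.max_attempts):
            try:
                self.actions[action](name)
                return True
            except TransientError:
                continue
        return False

    def progress(self) -> Dict[str, int]:
        """Count files per state, as a monitoring tool might display."""
        counts: Dict[str, int] = {}
        for state in self.status.values():
            counts[state] = counts.get(state, 0) + 1
        return counts
```

In this sketch a caller would supply its own `stage`, `transfer`, and `archive` callables, call `replicate` with a list of file names, and poll `progress()` for a per-state summary. A file whose action keeps failing is marked `failed:<action>` so the remaining files still proceed.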