{"title":"Large-scale scientific data and long-term data storage function in a computing center","authors":"Dmitry Vladimirovich Ivankov","doi":"10.15514/ispras-2022-34(4)-9","DOIUrl":null,"url":null,"abstract":"Long-term data storing is an important task for many modern scientific laboratories and datacenters. In order to reduce cost of digital information ownership, some solutions use magnetic tape technology and special software to control medium and data. Considering the on-site infrastructure specifics and well-established workflows of data processing, these organizations build and support such systems mainly by their own efforts, what becomes an important task in seeking to acquire the technological sovereignty. This paper describes long-term data storage issues in the computing center of the Zababakhin All-Russia Research Institute of Technical Physics where mathematical modeling computations generate vast amount of scientific data. The architecture and functional composition of the developed Archive Data Storage System are given as well as its internal data model, the chunk grouping rules, and the low-level tape format used. The measures taken to ensure an archived data consistency, methods of storage media management and issues of archival fund maintenance, are also considered. 
The calculation scheme of a typical archive system site’s hardware configuration, sufficient to process archiving data flows existing in datacenter, is given.","PeriodicalId":33459,"journal":{"name":"Trudy Instituta sistemnogo programmirovaniia RAN","volume":"15 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Trudy Instituta sistemnogo programmirovaniia RAN","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15514/ispras-2022-34(4)-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Long-term data storage is an important task for many modern scientific laboratories and data centers. To reduce the cost of owning digital information, some solutions use magnetic tape technology together with specialized software that manages the media and the data. Given the specifics of their on-site infrastructure and their well-established data processing workflows, these organizations build and support such systems mainly through their own efforts, which becomes an important task in the pursuit of technological sovereignty. This paper describes long-term data storage issues at the computing center of the Zababakhin All-Russia Research Institute of Technical Physics, where mathematical modeling computations generate vast amounts of scientific data. It presents the architecture and functional composition of the developed Archive Data Storage System, along with its internal data model, chunk grouping rules, and low-level tape format. It also considers the measures taken to ensure the consistency of archived data, methods of storage media management, and issues in maintaining the archival collection. Finally, it gives a scheme for calculating the hardware configuration of a typical archive system site’s installation, sized to handle the archiving data flows that exist in the data center.