Hierarchical Storage Systems and File Formats for Web Archiving
H. Kawano
2011 21st International Conference on Systems Engineering, 2011-08-16
DOI: 10.1109/ICSEng.2011.46 (https://doi.org/10.1109/ICSEng.2011.46)
Many national libraries are working to crawl and store born-digital information, but they face difficult social, legal, and technical problems. In this paper, from the viewpoint of long-term preservation of digital content, we focus on the urgent problem of the storage system, since the size of the web archive is increasing exponentially. To archive this monotonically increasing content, we discuss the management of storage devices and file formats in web archive systems. First, we propose an architecture for a hierarchical storage system based on the characteristics of storage devices and file compression formats. Next, we modify the file-moving algorithm to use file access frequency. Finally, we evaluate the performance of the proposed algorithm using predicted data based on actual statistics from a web archive system.
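The abstract does not give the details of the file-moving algorithm, so the following is only a minimal sketch of the general idea of moving files between storage tiers according to access frequency. The tier names, the thresholds `promote_at` and `demote_at`, the `ArchiveFile` type, and the function `plan_moves` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical tier names, ordered fastest to slowest (an assumption, not from the paper).
TIERS = ["ssd", "hdd", "tape"]


@dataclass
class ArchiveFile:
    name: str
    tier: str
    accesses: int  # accesses observed in the current measurement window


def plan_moves(files, promote_at=100, demote_at=10):
    """Return (name, from_tier, to_tier) moves, at most one tier step per file.

    The thresholds are illustrative assumptions: frequently accessed files move
    one tier up, rarely accessed files move one tier down, the rest stay put.
    """
    moves = []
    for f in files:
        i = TIERS.index(f.tier)
        if f.accesses >= promote_at and i > 0:
            moves.append((f.name, f.tier, TIERS[i - 1]))
        elif f.accesses <= demote_at and i < len(TIERS) - 1:
            moves.append((f.name, f.tier, TIERS[i + 1]))
    return moves


# Made-up sample data for illustration only.
files = [
    ArchiveFile("file_a", "hdd", 250),   # hot file on a slower tier
    ArchiveFile("file_b", "ssd", 3),     # cold file on the fastest tier
    ArchiveFile("file_c", "hdd", 40),    # neither hot nor cold
    ArchiveFile("file_d", "tape", 0),    # cold, but already on the slowest tier
]
for move in plan_moves(files):
    print(move)
```

With this sample data, only `file_a` and `file_b` are scheduled to move. Keeping two separate thresholds leaves a band in between where files stay on their current tier, which avoids moving a file back and forth when its access count fluctuates around a single cutoff.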