Study and Optimize the Process of Batch Small Files Replication
Authors: Liang Xiao, Q. Cao, C. Xie, Chuanwen Wu
Published in: 2008 Japan-China Joint Workshop on Frontier of Computer Science and Technology
Publication date: 2008-12-27
DOI: 10.1109/FCST.2008.32 (https://doi.org/10.1109/FCST.2008.32)
Citations: 0
Abstract
I/O performance is the traditional criterion for evaluating a storage system. Much research has been devoted to improving storage system performance, focusing mainly on storage architecture and on I/O optimization for storage devices. In many application systems, batches of small files are replicated between two locations, and this operation typically performs poorly. This paper analyzes and optimizes the replication process for batches of small files in the Linux file system. In the local case, six algorithms are obtained by applying parallel, consecutive, and aggregating policies at different stages of the whole process. In the network case, archive and compress strategies are also introduced and compared with the aggregating algorithm. Moreover, the average latency of the basic operations in each stage of file I/O can be estimated accurately, which is helpful for future file system research. The experiments show that the algorithm that reads source files consecutively and writes target files in parallel performs best in local replication, and that the aggregating algorithm performs best in network replication.
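
The best-performing local policy named in the abstract (consecutive reads of the source files, parallel writes of the target files) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no implementation details, so the function name `replicate_batch`, the `writers` parameter, and the use of a thread pool are all assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def replicate_batch(src_dir, dst_dir, writers=4):
    """Copy every regular file in src_dir into dst_dir.

    Hypothetical sketch: source files are read one after another on the
    calling thread, while each write is handed to a thread pool so that
    several target files can be written at the same time.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    def write_target(name, data):
        # Runs on a worker thread: one target file per task.
        (dst_dir / name).write_bytes(data)
        return name

    futures = []
    with ThreadPoolExecutor(max_workers=writers) as pool:
        # Sorted order keeps the read sequence deterministic.
        for src in sorted(p for p in src_dir.iterdir() if p.is_file()):
            data = src.read_bytes()  # consecutive (sequential) read
            futures.append(pool.submit(write_target, src.name, data))
    # Leaving the with-block waits for all pending writes to finish;
    # result() re-raises any error that occurred in a worker.
    return [f.result() for f in futures]
```

A call such as `replicate_batch("src", "dst", writers=8)` returns the copied file names in read order. The sketch holds the contents of not-yet-written files in memory, which is acceptable only because the files are assumed to be small; it makes no claim about the latencies or speedups measured in the paper.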