Rack-Scaling: An efficient rack-based redistribution method to accelerate the scaling of cloud disk arrays
Zhehan Lin, Hanchen Guo, Chentao Wu, Jie Li, Guangtao Xue, M. Guo
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2021-05-01
DOI: 10.1109/IPDPS49936.2021.00098 (https://doi.org/10.1109/IPDPS49936.2021.00098)
Citations: 1
Abstract
In cloud storage systems, disk arrays are widely used because of their high reliability and low monetary cost. Because of I/O bursts in sprinting computing scenarios (e.g., online retail services on Black Friday or Cyber Monday), large-scale cloud storage systems such as AWS S3 and GFS must sustain 10X I/O workloads. Rack-level scaling of cloud disk arrays is therefore urgent for sprinting services. Although several existing methods, such as Round-Robin (RR) and Scale-RS, have been proposed to accelerate the scaling process, their efficiency is limited because their designs give little consideration to cross-rack data migration. To address this problem, we propose Rack-Scaling, a novel data redistribution method that accelerates the rack-level scaling process in cloud storage systems. The basic idea of Rack-Scaling is to migrate appropriate data blocks within and among racks to achieve a uniform data distribution while minimizing cross-rack migration, which costs more than intra-rack migration. We run simulations with DiskSim and also implement Rack-Scaling on Hadoop to demonstrate its effectiveness. The results show that, compared to typical methods such as Round-Robin (RR), Semi-RR, Scale-RS, and BDR, Rack-Scaling reduces the number of I/O operations and the amount of cross-rack data transmission by up to 90.4% and 99.9%, respectively, and speeds up scaling by up to 8.77X.
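To make the cost of cross-rack migration concrete, the following is a minimal sketch and not the Rack-Scaling algorithm itself, which the abstract does not specify. It assumes a hypothetical layout in which block b sits on disk b mod (number of disks), with disks numbered rack by rack, and it counts how many blocks a naive round-robin re-placement would move within and across racks when new racks are added. It then compares that with a simple lower bound: the new racks start empty, so a uniform per-rack distribution requires at least their fair share of blocks to cross racks. All function names and parameters are illustrative assumptions.

```python
def rack_of(disk: int, disks_per_rack: int) -> int:
    """Return the rack index of a disk, assuming disks are numbered rack by rack."""
    return disk // disks_per_rack


def round_robin_migration(num_blocks: int, old_racks: int, new_racks: int,
                          disks_per_rack: int) -> tuple[int, int]:
    """Count (intra_rack, cross_rack) block moves for naive round-robin re-placement.

    Assumption: before scaling, block b is on disk b % old_disks; after
    scaling, it is on disk b % total_disks.
    """
    old_disks = old_racks * disks_per_rack
    total_disks = (old_racks + new_racks) * disks_per_rack
    intra = cross = 0
    for block in range(num_blocks):
        src = block % old_disks
        dst = block % total_disks
        if src == dst:
            continue  # block stays where it is
        if rack_of(src, disks_per_rack) == rack_of(dst, disks_per_rack):
            intra += 1
        else:
            cross += 1
    return intra, cross


def minimal_cross_rack(num_blocks: int, old_racks: int, new_racks: int) -> int:
    """Lower bound on cross-rack moves needed for a uniform per-rack distribution.

    The new racks start empty, so each must receive its fair share of
    blocks from the old racks (assumes the share divides evenly).
    """
    return num_blocks * new_racks // (old_racks + new_racks)


if __name__ == "__main__":
    # Hypothetical setup: 240 blocks, 2 racks of 4 disks, scaled by 1 rack.
    print(round_robin_migration(240, 2, 1, 4))  # (0, 160)
    print(minimal_cross_rack(240, 2, 1))        # 80
```

In this toy setup, the naive re-placement moves 160 blocks across racks, twice the lower bound of 80. That gap is the kind of overhead a rack-aware redistribution method aims to remove.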