Zhehan Lin, Hanchen Guo, Chentao Wu, Jie Li, Guangtao Xue, M. Guo
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), published 2021-05-01. DOI: https://doi.org/10.1109/IPDPS49936.2021.00098
Rack-Scaling: An efficient rack-based redistribution method to accelerate the scaling of cloud disk arrays
In cloud storage systems, disk arrays are widely used because of their high reliability and low monetary cost. Because of I/O bursts in sprinting computing scenarios (e.g., online retail services on Black Friday or Cyber Monday), large-scale cloud storage systems such as AWS S3 and GFS must sustain 10× I/O workloads. Rack-level scaling of cloud disk arrays therefore becomes urgent for sprinting services. Although several existing methods, such as Round-Robin (RR) and Scale-RS, have been proposed to accelerate the scaling process, their efficiency is limited because their designs give little consideration to cross-rack data migration. To address this problem, we propose Rack-Scaling, a novel data redistribution method that accelerates the rack-level scaling process in cloud storage systems. The basic idea of Rack-Scaling is to migrate appropriate data blocks within and among racks so as to achieve a uniform data distribution while minimizing cross-rack migration, which costs more than intra-rack migration. We conduct simulations with DiskSim and also implement Rack-Scaling on Hadoop to demonstrate its effectiveness. The results show that, compared to typical methods such as Round-Robin (RR), Semi-RR, Scale-RS and BDR, Rack-Scaling reduces the number of I/O operations and the amount of data transmitted across racks by up to 90.4% and 99.9%, respectively, and speeds up scaling by up to 8.77×.
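To make the cost of naive redistribution concrete, the following toy sketch counts block moves when one rack is added to a round-robin layout. It is not the Rack-Scaling algorithm from the paper; the function names, the round-robin placement rule, and the "ship only the surplus" baseline are all assumptions chosen for illustration.

```python
def rack_of(disk: int, disks_per_rack: int) -> int:
    """Map a disk index to its rack index (assumes racks of equal size)."""
    return disk // disks_per_rack


def round_robin_moves(num_blocks: int, old_disks: int, new_disks: int,
                      disks_per_rack: int) -> tuple[int, int]:
    """Count (moved, cross_rack) blocks when block b is re-placed
    from disk b % old_disks to disk b % new_disks."""
    moved = cross = 0
    for b in range(num_blocks):
        src, dst = b % old_disks, b % new_disks
        if src != dst:
            moved += 1
            if rack_of(src, disks_per_rack) != rack_of(dst, disks_per_rack):
                cross += 1
    return moved, cross


def surplus_only_moves(num_blocks: int, old_disks: int, new_disks: int) -> int:
    """Hypothetical baseline: every old disk keeps its fair share and ships
    only its surplus to the added disks. Assumes num_blocks is divisible
    by both disk counts."""
    per_old_disk = num_blocks // old_disks
    target = num_blocks // new_disks
    return (per_old_disk - target) * old_disks


if __name__ == "__main__":
    # Two racks of four disks grow to three racks of four disks.
    print(round_robin_moves(120, 8, 12, 4))
    print(surplus_only_moves(120, 8, 12))
```

In this toy setting, a full round-robin re-layout relocates far more blocks than the uniform distribution strictly requires, which is the kind of overhead the abstract attributes to methods that do not consider cross-rack migration.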