Rack-Scaling: An efficient rack-based redistribution method to accelerate the scaling of cloud disk arrays

2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2021-05-01 DOI:10.1109/IPDPS49936.2021.00098

Zhehan Lin, Hanchen Guo, Chentao Wu, Jie Li, Guangtao Xue, M. Guo


Citations: 1

Abstract

In cloud storage systems, disk arrays are widely used because of their high reliability and low monetary cost. Due to bursts of I/O in sprinting computing scenarios (e.g., online retail services on Black Friday or Cyber Monday), large-scale cloud storage systems such as AWS S3 and GFS need to sustain 10X I/O workloads. Rack-level scaling of cloud disk arrays is therefore urgent for sprinting services. Although several existing methods, such as Round-Robin (RR) and Scale-RS, have been proposed to accelerate the scaling process, their efficiency is limited because their designs do not adequately account for cross-rack data migration. To address this problem, we propose Rack-Scaling, a novel data redistribution method that accelerates the rack-level scaling process in cloud storage systems. The basic idea of Rack-Scaling is to migrate appropriate data blocks within and among racks to achieve a uniform data distribution while minimizing cross-rack migration, which costs more than intra-rack migration. We conduct simulations with DiskSim and also implement Rack-Scaling on Hadoop to demonstrate its effectiveness. The results show that, compared to typical methods such as Round-Robin (RR), Semi-RR, Scale-RS, and BDR, Rack-Scaling reduces the number of I/O operations and the amount of data transmitted across racks by up to 90.4% and 99.9%, respectively, and speeds up scaling by up to 8.77X.
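The gap between round-robin re-striping and a redistribution that moves only what uniformity requires can be illustrated with a toy model. The sketch below is not the Rack-Scaling algorithm from the paper; it is a simplified illustration under stated assumptions: every rack holds the same number of disks, blocks are initially striped round-robin, the block count divides evenly across the enlarged array, and every function name is hypothetical.

```python
from collections import Counter


def round_robin_layout(num_blocks, num_racks, disks_per_rack):
    """Place block b on global disk (b mod total disks); return (rack, disk) pairs."""
    total_disks = num_racks * disks_per_rack
    return [divmod(b % total_disks, disks_per_rack) for b in range(num_blocks)]


def minimal_move_layout(old_layout, old_racks, new_racks, disks_per_rack):
    """Hypothetical baseline: move only surplus blocks from old disks to new disks.

    Assumes len(old_layout) is divisible by the new total number of disks.
    """
    total_disks = new_racks * disks_per_rack
    target = len(old_layout) // total_disks  # blocks per disk after scaling
    load = Counter(old_layout)
    # Disks in the newly added racks start empty and act as receivers.
    receivers = [(r, d)
                 for r in range(old_racks, new_racks)
                 for d in range(disks_per_rack)]
    new_layout = list(old_layout)
    idx = 0
    for block, location in enumerate(old_layout):
        # Skip receivers that have already reached the target load.
        while idx < len(receivers) and load[receivers[idx]] >= target:
            idx += 1
        if idx == len(receivers):
            break  # every new disk is full, so the layout is uniform
        if load[location] > target:
            load[location] -= 1
            load[receivers[idx]] += 1
            new_layout[block] = receivers[idx]
    return new_layout


def count_moves(old_layout, new_layout):
    """Return (cross_rack_moves, intra_rack_moves) between two layouts."""
    cross = intra = 0
    for (r0, d0), (r1, d1) in zip(old_layout, new_layout):
        if r0 != r1:
            cross += 1
        elif d0 != d1:
            intra += 1
    return cross, intra


if __name__ == "__main__":
    old = round_robin_layout(240, num_racks=2, disks_per_rack=2)
    restriped = round_robin_layout(240, num_racks=3, disks_per_rack=2)
    minimal = minimal_move_layout(old, old_racks=2, new_racks=3, disks_per_rack=2)
    print("round-robin (cross, intra):", count_moves(old, restriped))
    print("minimal-move (cross, intra):", count_moves(old, minimal))
```

In this toy setting (240 blocks, scaling from 2 racks to 3 racks with 2 disks each), round-robin re-striping moves 160 blocks across racks, while the minimal-move baseline moves only the 80 blocks that the new rack must receive to reach a uniform 40 blocks per disk. The figures say nothing about the paper's measured results; they only show why ignoring rack boundaries inflates cross-rack traffic.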

