Rack-Scaling: An efficient rack-based redistribution method to accelerate the scaling of cloud disk arrays

2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Pub Date: 2021-05-01
DOI: 10.1109/IPDPS49936.2021.00098

Zhehan Lin, Hanchen Guo, Chentao Wu, Jie Li, Guangtao Xue, M. Guo


Citations: 1

Abstract

In cloud storage systems, disk arrays are widely used because of their high reliability and low monetary cost. During I/O bursts in sprinting computing scenarios (e.g., online retail services on Black Friday or Cyber Monday), large-scale cloud storage systems such as AWS S3 and GFS need to sustain 10x I/O workloads. Rack-level scaling of cloud disk arrays therefore becomes urgent for sprinting services. Although several existing methods, such as Round-Robin (RR) and Scale-RS, have been proposed to accelerate the scaling process, their efficiency is limited because their designs give little consideration to cross-rack data migration. To address this problem, we propose Rack-Scaling, a novel data redistribution method that accelerates rack-level scaling in cloud storage systems. The basic idea of Rack-Scaling is to migrate appropriate data blocks within and among racks to achieve a uniform data distribution while minimizing cross-rack migration, which costs more than intra-rack migration. We conduct simulations with DiskSim and also implement Rack-Scaling on Hadoop to demonstrate its effectiveness. The results show that, compared with typical methods such as Round-Robin (RR), Semi-RR, Scale-RS, and BDR, Rack-Scaling reduces the number of I/O operations and the amount of data transmitted across racks by up to 90.4% and 99.9%, respectively, and speeds up scaling by up to 8.77x.
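To make the cost being minimized concrete, here is a minimal Python sketch that counts migrated blocks and cross-rack migrations when one rack is added. It is not the Rack-Scaling algorithm, which the abstract does not specify. It only contrasts a plain round-robin re-striping with a hypothetical "move only the surplus" plan. The function names, the toy layout (two racks of two disks, plus one new rack), and the assumption that blocks start out evenly spread are all illustrative.

```python
def round_robin(num_blocks, num_disks):
    """Round-robin placement: block b lives on disk b % num_disks."""
    return [b % num_disks for b in range(num_blocks)]


def minimal_move(old, old_disks, new_disks):
    """Hypothetical plan: move only surplus blocks from old disks to new disks.

    Assumes the old placement is already uniform and that the block count
    divides evenly across all disks (illustrative simplifications).
    """
    total = old_disks + new_disks
    if len(old) % total:
        raise ValueError("block count must divide evenly across all disks")
    target = len(old) // total          # blocks per disk after scaling
    load = [0] * total
    for disk in old:
        load[disk] += 1
    new = list(old)
    dst = old_disks                     # first newly added disk
    for block, src in enumerate(old):
        if load[src] > target:          # source disk still holds surplus
            while load[dst] >= target:  # skip new disks that are full
                dst += 1
            new[block] = dst
            load[src] -= 1
            load[dst] += 1
    return new


def migration_cost(old, new, disks_per_rack):
    """Return (blocks moved, blocks moved across racks)."""
    moved = cross = 0
    for src, dst in zip(old, new):
        if src != dst:
            moved += 1
            if src // disks_per_rack != dst // disks_per_rack:
                cross += 1
    return moved, cross


# Toy layout: 2 racks x 2 disks holding 24 blocks, then 1 rack (2 disks) is added.
DISKS_PER_RACK = 2
old_layout = round_robin(24, 4)
rr_layout = round_robin(24, 6)
mm_layout = minimal_move(old_layout, old_disks=4, new_disks=2)

print("round-robin  :", migration_cost(old_layout, rr_layout, DISKS_PER_RACK))
print("minimal-move :", migration_cost(old_layout, mm_layout, DISKS_PER_RACK))
```

In this toy case every migration happens to cross a rack boundary, because the only empty disks are in the new rack. The sketch therefore shows only how avoiding a full re-striping shrinks the number of moves. It does not reproduce the intra-rack rebalancing the abstract describes, nor any of the reported figures.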
