Fluid-Shuttle: Efficient Cloud Data Transmission Based on Serverless Computing Compression
Authors: Rong Gu; Shulin Wang; Haipeng Dai; Xiaofei Chen; Zhaokang Wang; Wenjie Bao; Jiaqi Zheng; Yaofeng Tu; Yihua Huang; Lianyong Qi; Xiaolong Xu; Wanchun Dou; Guihai Chen
Journal: IEEE/ACM Transactions on Networking, vol. 32, no. 6, pp. 4554-4569 (Journal Article)
Publication date: 2024-10-10
DOI: 10.1109/TNET.2024.3402561
URL: https://ieeexplore.ieee.org/document/10713271/
Impact factor: 3.0 (JCR Q2, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
Cross-region data transmission is in high demand on the cloud. Using serverless computing to compress data before transfer is a promising way to reduce the total data size. However, it is challenging to estimate the transmission time and monetary cost when serverless compression is used, and minimizing the transmission cost is non-trivial because the parameter space is enormous. This paper addresses this problem and makes the following contributions: 1) We propose empirical models of data transmission time and monetary cost based on serverless compression. The models can also predict compression characteristics, e.g., ratio and speed, using chunk sampling and machine learning techniques. 2) For single-task cloud data transmission, we propose two efficient parameter search methods, based on Sequential Quadratic Programming (SQP) and Eliminate then Divide and Conquer (EDC), with proven error upper bounds. We also propose a parameter fine-tuning strategy to deal with variance in transmission bandwidth. 3) For multi-task scenarios, we propose a parameter search method based on dynamic programming and numerical computation. We have implemented these techniques in a system called Fluid-Shuttle, which includes straggler optimization, cache optimization, and an autoscaling decompression mechanism. Finally, we evaluate Fluid-Shuttle with various workloads and applications on the real-world AWS serverless computing platform. Experimental results show that the proposed approach improves parameter search efficiency by over $3\times$ compared with state-of-the-art methods and achieves better parameter quality. In addition, our approach achieves higher time efficiency and lower monetary cost than competing cloud data transmission approaches.
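As a rough illustration of the chunk-sampling idea mentioned in the abstract, the sketch below estimates a compression ratio and compression speed from a few sampled chunks instead of compressing the whole input. It is not the paper's method: `zlib` stands in for whatever codec is actually used, the helper name `estimate_compression` and its parameters (`chunk_size`, `num_samples`, `level`, `seed`) are hypothetical, and the machine-learning predictor described in the abstract is not modeled.

```python
import random
import time
import zlib


def estimate_compression(data: bytes, chunk_size: int = 64 * 1024,
                         num_samples: int = 8, level: int = 6,
                         seed: int = 0) -> tuple[float, float]:
    """Estimate (compression ratio, speed in bytes/s) from sampled chunks.

    Hypothetical helper for illustration only; zlib is an assumed codec.
    """
    if not data:
        raise ValueError("data must be non-empty")
    rng = random.Random(seed)
    # Candidate chunk start offsets, aligned to chunk_size.
    offsets = list(range(0, len(data), chunk_size))
    sampled = rng.sample(offsets, min(num_samples, len(offsets)))

    raw_bytes = 0
    compressed_bytes = 0
    elapsed = 0.0
    for off in sampled:
        chunk = data[off:off + chunk_size]
        start = time.perf_counter()
        out = zlib.compress(chunk, level)
        elapsed += time.perf_counter() - start
        raw_bytes += len(chunk)
        compressed_bytes += len(out)

    ratio = raw_bytes / compressed_bytes  # > 1 means the data shrinks
    speed = raw_bytes / elapsed if elapsed > 0 else float("inf")
    return ratio, speed


if __name__ == "__main__":
    payload = b"cross-region transfer " * 50_000
    ratio, speed = estimate_compression(payload)
    print(f"estimated ratio: {ratio:.1f}x, speed: {speed / 1e6:.1f} MB/s")
```

Sampling only a handful of chunks keeps the estimate cheap, at the price of accuracy on inputs whose compressibility varies strongly across the file.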
About the journal:
The IEEE/ACM Transactions on Networking’s high-level objective is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media: e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.