Jiajie Shen;Bochun Wu;Maoyi Wang;Sai Zou;Laizhong Cui;Wei Ni
RLDR: Reinforcement Learning-Based Fast Data Recovery in Cloud-of-Clouds Storage Systems
IEEE Transactions on Cloud Computing, vol. 13, no. 2, pp. 526-543. Journal Article. Published 2025-02-26.
DOI: 10.1109/TCC.2025.3546528
URL: https://ieeexplore.ieee.org/document/10906478/
Impact factor: 5.3 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Cloud-of-clouds storage systems are widely used in online applications, where user data are encrypted, encoded, and stored across multiple clouds. When some cloud nodes fail, the storage system reconstructs the lost data and stores it on substitute nodes. Reducing data recovery latency to ensure data reliability remains a challenge. In this paper, we adopt a Reinforcement Learning-based Data Recovery (RLDR) approach to reduce the regeneration time. By employing the Monte Carlo method, our approach constructs a tree-topology-based regeneration process, a.k.a. a regeneration tree, that effectively reduces the regeneration time. Through rigorous analysis, we apply the information flow graph to optimize the inter-cloud traffic for a given regeneration tree. To verify the merit of RLDR, we conduct extensive experiments on real-world traces. The experiments demonstrate that RLDR can significantly accelerate the regeneration process: it reduces the regeneration time by up to 92% and increases the throughput by up to twelve-fold compared to the prior art.
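To make the idea of a regeneration tree concrete, here is a minimal sketch, not the RLDR algorithm itself. It assumes a simplified model that the abstract does not state: every tree edge carries one block of the same size, so the regeneration time is set by the slowest edge in the tree. The search is plain random sampling standing in for the paper's reinforcement-learning procedure, and all names (`regeneration_time`, `random_tree`, `monte_carlo_search`) and the bandwidth values are hypothetical.

```python
import random

def regeneration_time(parent, bandwidth, block_size):
    """Assumed model: each edge moves one block, so the slowest edge sets the time."""
    bottleneck = min(bandwidth[child][par] for child, par in parent.items())
    return block_size / bottleneck

def random_tree(num_nodes, rng):
    """Sample a random tree rooted at node 0 (the substitute node)."""
    order = list(range(1, num_nodes))
    rng.shuffle(order)
    attached = [0]  # nodes already connected to the root
    parent = {}
    for node in order:
        parent[node] = rng.choice(attached)
        attached.append(node)
    return parent

def monte_carlo_search(bandwidth, block_size, samples, seed=0):
    """Keep the best of `samples` random trees, starting from the star topology."""
    rng = random.Random(seed)
    n = len(bandwidth)
    best_tree = {node: 0 for node in range(1, n)}
    best_time = regeneration_time(best_tree, bandwidth, block_size)
    for _ in range(samples):
        tree = random_tree(n, rng)
        t = regeneration_time(tree, bandwidth, block_size)
        if t < best_time:
            best_tree, best_time = tree, t
    return best_tree, best_time

# Hypothetical symmetric link bandwidths in MB/s; node 0 is the substitute node.
bandwidth = [
    [0, 10, 100, 80],
    [10, 0, 90, 20],
    [100, 90, 0, 50],
    [80, 20, 50, 0],
]
star = {1: 0, 2: 0, 3: 0}
star_time = regeneration_time(star, bandwidth, block_size=100)
best_tree, best_time = monte_carlo_search(bandwidth, block_size=100, samples=200)
```

In this toy instance the star topology is limited by the 10 MB/s link between nodes 1 and 0, whereas a tree that relays node 1 through node 2 avoids that link. Under the stated model, the best achievable time is 100 / 80 = 1.25 s, against 10 s for the star.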
About the journal:
The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.