Tiangang Li; Shi Ying; Xiangbo Tian; Ting Zhang; Yong Wang
Journal: IEEE Transactions on Software Engineering, vol. 51, no. 10, pp. 2921-2941 (JCR Q1, Computer Science, Software Engineering; impact factor 5.6)
Published: 2025-08-29
DOI: 10.1109/TSE.2025.3603995
URL: https://ieeexplore.ieee.org/document/11145190/
ASTRA: Adversarial Sim-to-Real Transfer Reinforcement Learning for Autoscaling in Cloud Systems
With the widespread adoption of cloud computing, autoscaling has become crucial for efficient resource management and stable service provision in cloud systems. In recent years, autoscaling methods based on deep reinforcement learning (DRL) have gained significant attention due to their outstanding adaptability and flexibility. However, training a DRL-based autoscaler requires interactions with real cloud systems, incurring high interaction costs, low data collection efficiency, and potential operational impacts. To address these challenges, we propose ASTRA, a sim-to-real transfer reinforcement learning framework for autoscaling. ASTRA constructs a cloud system simulation environment based on a performance estimation model, enabling low-cost and high-efficiency training sample collection for policy learning. The learned policy is subsequently transferred to the real system for scaling decisions. To address performance modeling inaccuracies caused by dynamic cloud state changes, we propose a performance modeling method based on a hybrid attentive state space model. By incorporating a state space model, it captures system dynamics and state evolution, effectively reducing simulation errors. Furthermore, to mitigate the performance degradation of the transferred policy due to distribution shift, we propose an autoscaling method based on adversarial soft actor-critic. By introducing adversarial policy training with gradient regularization based on state perturbations, it significantly improves transferred policy performance. The results in the real system demonstrate that ASTRA achieves the best overall performance in environment modeling, policy transfer, and real-world autoscaling. Specifically, ASTRA outperforms all baselines in terms of instance number, response time, SLO violation rate, and CPU utilization under different workload patterns. More importantly, under limited interaction costs, ASTRA achieves a 616.94× improvement in interaction sample collection rate compared to the direct online training method.
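The core idea in the abstract, collecting training samples from a simulated environment built on a performance estimation model rather than from the real cloud, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the class `SimulatedCloud`, the latency formula, the reward shape, and the threshold policy are hypothetical stand-ins, not ASTRA's hybrid attentive state space model or its adversarial soft actor-critic.

```python
from dataclasses import dataclass


@dataclass
class SimulatedCloud:
    """Toy simulated cloud system (hypothetical; not the paper's model)."""
    capacity_per_instance: float = 100.0  # requests/s one instance serves (assumed)
    base_latency_ms: float = 20.0         # response time at zero load (assumed)
    slo_ms: float = 200.0                 # response-time SLO threshold (assumed)
    instances: int = 1

    def estimate_latency(self, workload: float) -> float:
        # Stand-in performance estimation model: latency rises sharply
        # as utilization approaches 1 (utilization is capped at 0.99).
        capacity = self.instances * self.capacity_per_instance
        utilization = min(workload / capacity, 0.99)
        return self.base_latency_ms / (1.0 - utilization)

    def step(self, action: int, workload: float):
        # Apply the scaling action (-1, 0, or +1 instances), never below 1.
        self.instances = max(1, self.instances + action)
        latency = self.estimate_latency(workload)
        violated = latency > self.slo_ms
        # Assumed reward: pay for each instance, pay extra for an SLO violation.
        reward = -float(self.instances) - (10.0 if violated else 0.0)
        return (workload, float(self.instances), latency), reward, violated


def threshold_policy(state) -> int:
    """Placeholder policy; a DRL agent would be trained in its place."""
    _, instances, latency = state
    if latency > 100.0:
        return 1
    if latency < 30.0 and instances > 1:
        return -1
    return 0


def collect_samples(env: SimulatedCloud, policy, workloads):
    """Roll the policy through the simulator and record transitions."""
    first = workloads[0]
    state = (first, float(env.instances), env.estimate_latency(first))
    samples = []
    for workload in workloads:
        action = policy(state)
        next_state, reward, _ = env.step(action, workload)
        samples.append((state, action, reward, next_state))
        state = next_state
    return samples


if __name__ == "__main__":
    env = SimulatedCloud()
    for transition in collect_samples(env, threshold_policy, [50, 150, 300, 300, 80]):
        print(transition)
```

Because each `step` is only a function call, transitions can be generated far faster than by waiting on a real deployment, which is the kind of sample-collection gain the abstract reports. How well a policy trained this way transfers depends on how closely the simulator matches the real system, which is the gap the paper's performance modeling and adversarial training are meant to address.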
About the journal:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.