{"title":"Reducing the Length of Field-Replay Based Load Testing","authors":"Yuanjie Xia;Lizhi Liao;Jinfu Chen;Heng Li;Weiyi Shang","doi":"10.1109/TSE.2024.3408079","DOIUrl":null,"url":null,"abstract":"As software systems continuously grow in size and complexity, performance and load related issues have become more common than functional issues. Load testing is usually performed before software releases to ensure that the software system can still provide quality service under a certain load. Therefore, one of the common challenges of load testing is to design realistic workloads that can represent the actual workload in the field. In particular, one of the most widely adopted and intuitive approaches is to directly replay the field workloads in the load testing environment. However, replaying a lengthy, e.g., 48 hours, field workloads is rather resource- and time-consuming, and sometimes even infeasible for large-scale software systems that adopt a rapid release cycle. On the other hand, replaying a short duration of the field workloads may still result in unrealistic load testing. In this work, we propose an automated approach to reduce the length of load testing that is driven by replaying the field workloads. The intuition of our approach is: if the measured performance associated with a particular system behaviour is already stable, we can skip subsequent testing of this system behaviour to reduce the length of the field workloads. In particular, our approach first clusters execution logs that are generated during the system runtime to identify similar system behaviours during the field workloads. Then, we use statistical methods to determine whether the measured performance associated with a system behaviour has been stable. We evaluate our approach on three open-source projects (i.e., \n<italic>OpenMRS</i>\n, \n<italic>TeaStore</i>\n, and \n<italic>Apache James</i>\n). The results show that our approach can significantly reduce the length of field workloads while the workloads-after-reduction produced by our approach are representative of the original set of workloads. More importantly, the load testing results obtained by replaying the workloads after the reduction have high correlation and similar trend with the original set of workloads. Practitioners can leverage our approach to perform realistic field-replay based load testing while saving the needed resources and time. Our approach sheds light on future research that aims to reduce the cost of load testing for large-scale software systems.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":null,"pages":null},"PeriodicalIF":6.5000,"publicationDate":"2024-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10543182/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
As software systems continuously grow in size and complexity, performance- and load-related issues have become more common than functional issues. Load testing is usually performed before software releases to ensure that the software system can still provide quality service under a certain load. Therefore, one of the common challenges of load testing is to design realistic workloads that represent the actual workload in the field. In particular, one of the most widely adopted and intuitive approaches is to directly replay the field workloads in the load testing environment. However, replaying a lengthy (e.g., 48-hour) field workload is rather resource- and time-consuming, and sometimes even infeasible for large-scale software systems that adopt a rapid release cycle. On the other hand, replaying only a short portion of the field workloads may result in unrealistic load testing. In this work, we propose an automated approach to reduce the length of load testing that is driven by replaying field workloads. The intuition behind our approach is that if the measured performance associated with a particular system behaviour is already stable, we can skip subsequent testing of this behaviour to reduce the length of the field workloads. In particular, our approach first clusters execution logs generated during system runtime to identify similar system behaviours in the field workloads. Then, we use statistical methods to determine whether the measured performance associated with a system behaviour has become stable. We evaluate our approach on three open-source projects (i.e., OpenMRS, TeaStore, and Apache James). The results show that our approach can significantly reduce the length of field workloads while the reduced workloads remain representative of the original set of workloads. More importantly, the load testing results obtained by replaying the reduced workloads are highly correlated with, and show trends similar to, those of the original workloads. Practitioners can leverage our approach to perform realistic field-replay based load testing while saving the required resources and time. Our approach sheds light on future research that aims to reduce the cost of load testing for large-scale software systems.
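
To make the two steps described in the abstract concrete, the sketch below shows one plausible realization. This is our own illustrative reading, not the authors' implementation: we assume log windows are abstracted into event-count vectors and clustered with k-means (the abstract does not name a clustering algorithm), and we assume a behaviour's performance counts as stable once two consecutive windows of its measurements are statistically indistinguishable under a Mann-Whitney U test or differ only by a negligible Cliff's delta. All function names, thresholds, and data shapes are hypothetical.

```python
# Illustrative sketch (our assumptions, not the paper's implementation) of:
#   Step 1: cluster execution-log windows to identify similar system behaviours.
#   Step 2: skip replaying a behaviour once its measured performance is stable.
from collections import Counter

import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.cluster import KMeans


def log_window_to_vector(log_lines, event_vocabulary):
    """Abstract a window of log lines into an event-count vector
    (assumes the first token of each line identifies the log event)."""
    counts = Counter(line.split()[0] for line in log_lines)
    return np.array([counts.get(event, 0) for event in event_vocabulary], float)


def cluster_behaviours(vectors, n_clusters=8):
    """Step 1: group log windows with similar event distributions (k-means
    is an assumption; the clustering algorithm is not named in the abstract)."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(
        np.vstack(vectors)
    )


def cliffs_delta(a, b):
    """Effect size in [-1, 1]; |delta| < 0.147 is conventionally negligible."""
    diff = np.asarray(a)[:, None] - np.asarray(b)[None, :]
    return (np.sum(diff > 0) - np.sum(diff < 0)) / diff.size


def is_stable(measurements, window=30, alpha=0.05, negligible=0.147):
    """Step 2 (assumed criterion): a behaviour's performance is stable when
    the two most recent windows of measurements (e.g., response times) are
    statistically indistinguishable or differ by a negligible effect size."""
    if len(measurements) < 2 * window:
        return False
    previous, recent = measurements[-2 * window:-window], measurements[-window:]
    _, p_value = mannwhitneyu(previous, recent, alternative="two-sided")
    return p_value >= alpha or abs(cliffs_delta(previous, recent)) < negligible


def reduce_workload(log_windows, response_times, event_vocabulary):
    """Replay the field workload window by window, skipping windows whose
    behaviour cluster has already shown stable performance. Returns the
    indices of the windows kept in the reduced workload."""
    vectors = [log_window_to_vector(w, event_vocabulary) for w in log_windows]
    labels = cluster_behaviours(vectors)
    observed = {label: [] for label in set(labels)}
    kept = []
    for index, label in enumerate(labels):
        if is_stable(observed[label]):
            continue  # behaviour already characterized: skip to shorten the test
        kept.append(index)
        observed[label].extend(response_times[index])  # measured during replay
    return kept
```

In a real deployment the measurements would arrive online while the workload is replayed, so the skip decision for each cluster would be made on the fly; the precomputed response_times array here only keeps the sketch self-contained.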
Journal Introduction:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.