All-Spark: Using Simulation Tests Directly in Production Environments to Detect System Bottlenecks in Large-Scale Systems

Jialiang Lin, Jun Zhang, Yu Ding, Liping Zhang, Yin Han
Published in: Proceedings of the 19th International Middleware Conference, 2018-11-26
DOI: 10.1145/3274808.3274809 (https://doi.org/10.1145/3274808.3274809)
Citations: 1

Abstract

With the rapid growth of e-commerce, large-scale promotional events have become common. During a promotion, traffic can reach hundreds of times the volume of a normal day; when the existing system cannot be adjusted efficiently to absorb that traffic, it becomes a bottleneck that restricts the continued growth of the online business. Traditional capacity prediction methods have proven unable to make accurate predictions for such scenarios because of a variety of unpredictable system bottlenecks. Simulation testing in a completely new test environment at this scale has a number of limitations, such as the high cost of setting up the environment and the difficulty of testing the entire environment. Moreover, bottlenecks found on test servers may differ from those on production servers. We investigated online simulation in the production environment and built a complete simulation test system called All-Sparks. This solution solved the long-standing problem of running high-traffic simulation tests in the production environment without causing any data pollution. Each year, the simulation tests revealed hundreds of bottlenecks under high workload pressure, eliminating hidden problems introduced by new applications. The final capacity evaluation deviated from the actual capacity by less than 5%, and the error rate was small (<2%); both are significant improvements over traditional prediction results. The solution also provides a framework that extends well to scenarios other than stress testing.
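The abstract reports that the capacity evaluation deviated from the actual capacity by less than 5%, but it does not define the deviation measure. As a minimal sketch, assuming deviation means the relative error |predicted - actual| / actual, the check could look like the following. The function names and the traffic figures are hypothetical and are not taken from the paper.

```python
def relative_deviation(predicted: float, actual: float) -> float:
    """Return |predicted - actual| / actual as a fraction (assumed definition)."""
    if actual <= 0:
        raise ValueError("actual capacity must be positive")
    return abs(predicted - actual) / actual


def within_tolerance(predicted: float, actual: float, tolerance: float = 0.05) -> bool:
    """True when the prediction deviates from the actual value by less than `tolerance`."""
    return relative_deviation(predicted, actual) < tolerance


# Hypothetical numbers for illustration only; these are not results from the paper.
predicted_qps = 97_000.0
actual_qps = 100_000.0

deviation = relative_deviation(predicted_qps, actual_qps)
print(f"deviation = {deviation:.1%}")
print("within 5%:", within_tolerance(predicted_qps, actual_qps))
```

The 5% default mirrors the bound quoted in the abstract; any other tolerance can be passed explicitly.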