Bottleneck identification and failure prevention with procedural learning in 5G RAN

Tobias Sundqvist, M. Bhuyan, E. Elmroth
Published in: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Publication date: 2023-05-01
DOI: 10.1109/CCGrid57682.2023.00047 (https://doi.org/10.1109/CCGrid57682.2023.00047)
Citation count: 0

Abstract

To meet the low-latency requirements of 5G Radio Access Networks (RAN), it is essential to learn where performance bottlenecks occur. Because RAN components are distributed and virtualized, pinpointing the source of unwanted delays is difficult. Today, vendors spend considerable manual effort analyzing key performance indicators (KPIs) and system logs to detect these bottlenecks. The 5G architecture allows microservices to be scaled flexibly to handle variations in traffic, but knowing how, when, and where to scale is difficult without a detailed latency analysis. In this article, we propose a novel method that combines procedural learning with latency analysis of system log events. The method, which we call LogGenie, learns the latency pattern of the system under different load scenarios and automatically identifies the parts with the most significant increase in latency. Our evaluation in an advanced 5G testbed shows that LogGenie can provide a more detailed analysis than previous research has achieved and help troubleshooters locate bottlenecks faster. Finally, through experiments, we show how a latency prediction model can dynamically fine-tune system behavior where bottlenecks occur. This lowers resource utilization, makes the architecture more flexible, and allows the system to fulfill its latency requirements.
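The abstract does not spell out how LogGenie works, so the following is only a rough illustration of the general idea of comparing log-event latencies across load scenarios, not the authors' method. It assumes each log event carries a procedure identifier, a timestamp in milliseconds, and an event name. The event names (`recv`, `alloc`, `done`), the function names, and the choice of the median as the summary statistic are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def transition_latencies(events):
    """Group events by procedure and measure the time between consecutive events.

    events: iterable of (procedure_id, timestamp_ms, event_name) tuples.
    Returns {(event_a, event_b): [latency_ms, ...]}.
    """
    by_procedure = defaultdict(list)
    for procedure_id, timestamp_ms, name in events:
        by_procedure[procedure_id].append((timestamp_ms, name))

    latencies = defaultdict(list)
    for sequence in by_procedure.values():
        sequence.sort()  # order events in time within one procedure
        for (t0, a), (t1, b) in zip(sequence, sequence[1:]):
            latencies[(a, b)].append(t1 - t0)
    return latencies


def rank_latency_increase(baseline_events, loaded_events):
    """Rank event transitions by how much their median latency grows under load."""
    base = transition_latencies(baseline_events)
    load = transition_latencies(loaded_events)
    increases = [
        (step, median(load[step]) - median(base[step]))
        for step in base.keys() & load.keys()  # steps seen in both scenarios
    ]
    return sorted(increases, key=lambda item: item[1], reverse=True)


# Hypothetical log events for two procedures in each scenario.
baseline = [
    (1, 0, "recv"), (1, 2, "alloc"), (1, 5, "done"),
    (2, 10, "recv"), (2, 12, "alloc"), (2, 16, "done"),
]
loaded = [
    (1, 0, "recv"), (1, 3, "alloc"), (1, 15, "done"),
    (2, 20, "recv"), (2, 22, "alloc"), (2, 36, "done"),
]

ranked = rank_latency_increase(baseline, loaded)
for step, delta_ms in ranked:
    print(step, delta_ms)
```

In this toy data the `alloc` to `done` transition shows the largest growth in median latency, so it would be the first place to look for a bottleneck. A real analysis would need to handle branching procedures, missing events, and clock skew between distributed components, none of which this sketch addresses.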