Title: Domain Tailored Large Language Models for Log Mask Prediction in Cellular Network Diagnostics
Authors: Sayed Taheri; Achintha Ihalage; Prateek Mishra; Sean Coaker; Faris Muhammad; Hamed Al-Raweshidy
Journal: IEEE Transactions on Network and Service Management, vol. 22, no. 3, pp. 2370-2381
Publication date: 2025-02-17
Publication type: Journal Article
DOI: 10.1109/TNSM.2025.3541384
URL: https://ieeexplore.ieee.org/document/10891042/
Impact factor: 5.4 (JCR Q1, Computer Science, Information Systems)
Open access: no
Citations: 0
Abstract
Software logs generated by dedicated network testing hardware are often complex and bear minimal similarity to natural language, so engineers' expertise is required to understand them and capture the defects they record. This manual process is inefficient and expensive for both service providers and their clients. In this study, we demonstrate the transformative potential of Artificial Intelligence (AI), specifically through domain-tailoring of Large Language Models (LLMs) such as RoBERTa, BigBird, and Flan-T5, to streamline defect diagnostics. In particular, we pre-train these models from the ground up on a real industrial telecommunications log corpus and fine-tune them on a multi-label classification objective. This facilitates identifying the correct set of log points to enable for rapid detection of defects that arise during network testing. Despite several challenges, such as intricate text structures, a heavily skewed label distribution, and inconsistencies in historical data labelling, our tailored LLMs achieve commendable performance on previously unseen defect cases, significantly reducing turnaround times. This research not only serves as an exemplar for adapting LLMs in the telecommunications industry for automated defect diagnostics, but also has wide implications for software log analysis across various industries.
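As a rough illustration of the multi-label objective mentioned in the abstract, the sketch below scores each log point with an independent sigmoid and trains with binary cross-entropy, up-weighting positives for rare labels to counter a skewed label distribution. The abstract does not state the loss, weighting scheme, or decision threshold the authors used, so `weighted_bce`, `predict_mask`, `pos_weight`, and the 0.5 threshold are all assumptions made for this example, written in plain Python rather than any particular training framework.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def weighted_bce(logits, targets, pos_weight):
    """Mean binary cross-entropy over labels, one independent sigmoid per label.

    pos_weight[j] > 1 up-weights the positive term of label j, a common
    (assumed, not source-confirmed) remedy for rarely enabled log points.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y, w in zip(logits, targets, pos_weight):
        p = sigmoid(z)
        total += -(w * y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


def predict_mask(logits, threshold=0.5):
    """Indices of the (hypothetical) log points whose probability passes the threshold."""
    return [j for j, z in enumerate(logits) if sigmoid(z) >= threshold]


# Toy example: four candidate log points, label 2 is treated as rare.
logits = [2.0, -1.0, 0.3, -3.0]   # model scores for one defect report
targets = [1, 0, 1, 0]            # log points an engineer actually enabled
uniform = weighted_bce(logits, targets, [1.0, 1.0, 1.0, 1.0])
skew_aware = weighted_bce(logits, targets, [1.0, 1.0, 5.0, 1.0])
mask = predict_mask(logits)       # log points to enable
```

Because every label gets its own sigmoid, any number of log points can be enabled for a single defect, which is what separates this setup from single-label (softmax) classification. In this toy case the weighted loss is larger than the uniform one, since the model is only weakly confident about the rare positive label.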
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.