{"title":"orsight -2.0: O-RAN的基础法学硕士","authors":"Pranshav Gajjar;Vijay K. Shah","doi":"10.1109/TMLCN.2025.3592658","DOIUrl":null,"url":null,"abstract":"Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"903-920"},"PeriodicalIF":0.0000,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935","citationCount":"0","resultStr":"{\"title\":\"ORANSight-2.0: Foundational LLMs for O-RAN\",\"authors\":\"Pranshav Gajjar;Vijay K. Shah\",\"doi\":\"10.1109/TMLCN.2025.3592658\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. 
This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. 
Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.\",\"PeriodicalId\":100641,\"journal\":{\"name\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"volume\":\"3 \",\"pages\":\"903-920\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11096935/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11096935/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models; existing solutions often rely on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks (Mistral, Qwen, Llama, Phi, and Gemma), ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance on O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents, a Mistral-based Question Generator and a Qwen-based Answer Generator, to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORAN-Bench-13K and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We further evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and RAG-augmented inference. Finally, the best ORANSight-2.0 configuration is compared against available telecom LLMs, which it outperforms by an average of 27.96%.
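The abstract describes RANSTRUCT only at a high level, so the following is a minimal sketch of a two-agent, RAG-based instruction-generation loop in that spirit. Here `retrieve`, `question_agent`, and `answer_agent` are hypothetical stand-ins (e.g., a vector-store lookup plus Mistral and Qwen chat endpoints); the paper's actual prompts, chunking strategy, and quality filters are not specified in the abstract.

```python
import json

def build_instruction_dataset(corpus_chunks, retrieve, question_agent, answer_agent,
                              out_path="ranstruct_sft.jsonl"):
    """Hypothetical RANSTRUCT-style loop: for each O-RAN document chunk, a
    Mistral-based agent proposes questions and a Qwen-based agent answers them,
    grounded in retrieved context. Writes one instruction/output pair per line."""
    with open(out_path, "w") as f:
        for chunk in corpus_chunks:
            # Agent 1 (Question Generator): propose questions answerable from this chunk.
            questions = question_agent(
                f"Write three technical questions answerable from this O-RAN text:\n{chunk}"
            )
            for q in questions:
                # RAG step: fetch supporting passages for the question.
                context = "\n".join(retrieve(q, k=4))
                # Agent 2 (Answer Generator): answer strictly from the retrieved context.
                answer = answer_agent(f"Context:\n{context}\n\nQuestion: {q}\nAnswer:")
                f.write(json.dumps({"instruction": q, "output": answer}) + "\n")
```

The resulting JSONL file has the shape a supervised fine-tuning step would consume.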
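The generated dataset is then used to fine-tune the base models via QLoRA, i.e., low-rank adapters trained on top of a 4-bit quantized frozen base model. Below is a minimal sketch of a standard QLoRA setup using Hugging Face transformers, peft, and bitsandbytes; the base checkpoint, LoRA rank, and target modules are illustrative assumptions, not the paper's reported configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# QLoRA: load the frozen base model in 4-bit NF4 precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",  # illustrative choice of base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only low-rank adapters on the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

Training then proceeds with any causal-LM trainer (e.g., TRL's SFTTrainer) over the generated instruction pairs, which keeps memory costs low enough to fine-tune even the larger checkpoints on commodity GPUs.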