{"title":"Toward Addressing Training Data Scarcity Challenge in Emerging Radio Access Networks: A Survey and Framework","authors":"Haneya Naeem Qureshi;Usama Masood;Marvin Manalastas;Syed Muhammad Asad Zaidi;Hasan Farooq;Julien Forgeat;Maxime Bouton;Shruti Bothe;Per Karlsson;Ali Rizwan;Ali Imran","doi":"10.1109/COMST.2023.3271419","DOIUrl":null,"url":null,"abstract":"The future of cellular networks is contingent on artificial intelligence (AI) based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions are being proposed in literature to leverage AI for modeling and optimizing network behavior to achieve the zero-touch automation goal. However, to work reliably, AI based automation, requires a deluge of training data. Consequently, the success of the proposed AI solutions is limited by a fundamental challenge faced by cellular network research community: scarcity of the training data. In this paper, we present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in RAN and their known use-cases. We then present a taxonomized survey of techniques used in literature to address training data scarcity for various data types. This is followed by a framework to address the training data scarcity. The proposed framework builds on available information and combination of techniques including interpolation, domain-knowledge based, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as by matrix completion theory, and domain knowledge-based techniques leveraging different types of network geometries and network parameters. In addition, an overview of state-of-the art simulators and testbeds is also presented to make readers aware of current and emerging platforms to access real data in order to overcome the data scarcity challenge. The extensive survey of training data scarcity addressing techniques combined with proposed framework to select a suitable technique for given type of data, can assist researchers and network operators in choosing the appropriate methods to overcome the data scarcity challenge in leveraging AI to radio access network automation.","PeriodicalId":55029,"journal":{"name":"IEEE Communications Surveys and Tutorials","volume":"25 3","pages":"1954-1990"},"PeriodicalIF":34.4000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9739/10226436/10113782.pdf","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Communications Surveys and Tutorials","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10113782/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 3
Abstract
The future of cellular networks is contingent on artificial intelligence (AI) based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions are being proposed in literature to leverage AI for modeling and optimizing network behavior to achieve the zero-touch automation goal. However, to work reliably, AI based automation, requires a deluge of training data. Consequently, the success of the proposed AI solutions is limited by a fundamental challenge faced by cellular network research community: scarcity of the training data. In this paper, we present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in RAN and their known use-cases. We then present a taxonomized survey of techniques used in literature to address training data scarcity for various data types. This is followed by a framework to address the training data scarcity. The proposed framework builds on available information and combination of techniques including interpolation, domain-knowledge based, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as by matrix completion theory, and domain knowledge-based techniques leveraging different types of network geometries and network parameters. In addition, an overview of state-of-the art simulators and testbeds is also presented to make readers aware of current and emerging platforms to access real data in order to overcome the data scarcity challenge. The extensive survey of training data scarcity addressing techniques combined with proposed framework to select a suitable technique for given type of data, can assist researchers and network operators in choosing the appropriate methods to overcome the data scarcity challenge in leveraging AI to radio access network automation.
期刊介绍:
IEEE Communications Surveys & Tutorials is an online journal published by the IEEE Communications Society for tutorials and surveys covering all aspects of the communications field. Telecommunications technology is progressing at a rapid pace, and the IEEE Communications Society is committed to providing researchers and other professionals the information and tools to stay abreast. IEEE Communications Surveys and Tutorials focuses on integrating and adding understanding to the existing literature on communications, putting results in context. Whether searching for in-depth information about a familiar area or an introduction into a new area, IEEE Communications Surveys & Tutorials aims to be the premier source of peer-reviewed, comprehensive tutorials and surveys, and pointers to further sources. IEEE Communications Surveys & Tutorials publishes only articles exclusively written for IEEE Communications Surveys & Tutorials and go through a rigorous review process before their publication in the quarterly issues.
A tutorial article in the IEEE Communications Surveys & Tutorials should be designed to help the reader to become familiar with and learn something specific about a chosen topic. In contrast, the term survey, as applied here, is defined to mean a survey of the literature. A survey article in IEEE Communications Surveys & Tutorials should provide a comprehensive review of developments in a selected area, covering its development from its inception to its current state and beyond, and illustrating its development through liberal citations from the literature. Both tutorials and surveys should be tutorial in nature and should be written in a style comprehensible to readers outside the specialty of the article.