Directed grey box fuzzy testing for power terminal device firmware with intermediate representation similarity comparison
Zhongyuan Qin, Jiaqi Chen, Xin Sun, Yubo Song, Hua Dai, Weiwei Chen, Bang Lv, Kanghui Wang
Journal of Information Security and Applications, vol. 90, Article 104038 (published 2025-03-29). DOI: 10.1016/j.jisa.2025.104038
https://www.sciencedirect.com/science/article/pii/S2214212625000766
Impact factor: 3.8; JCR Q2 (Computer Science, Information Systems)
Citations: 0
Abstract
The proliferation of heterogeneous devices in power IoT terminals significantly increases security risks due to firmware vulnerabilities, thereby threatening the stability and reliability of power systems. However, existing Directed Greybox Fuzzing (DGF) methods face challenges, such as the need for manual identification of vulnerable code and limitations to specific architectures. This paper proposes a DGF approach, guided by intermediate representation similarity comparison, comprising two main components: objective function localization and directed greybox fuzzing. In the objective function localization phase, support for multiple architectures is achieved by lifting the binary code to LLVM Intermediate Representation (IR). Given that functions may vary in both structure and semantics, we represent functions using both structural and semantic features. We employ word embedding techniques based on Natural Language Processing (NLP) and graph neural network models to construct feature vectors. By calculating the feature similarity between each function and known vulnerable functions, we automatically identify highly similar functions as targets. In the directed greybox fuzzing phase, to address issues like high false positive rates and unreachable targets, we designed a target scheduling mechanism. This mechanism permanently blocks targets that have been sufficiently covered and periodically blocks those that have not been covered, thereby further improving the efficiency of fuzzing. Experimental results on two datasets demonstrate the effectiveness of this method in identifying vulnerabilities in power terminal equipment.
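The two phases described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the similarity metric, so cosine similarity is assumed here, and the feature vectors, the `threshold` value, and the `covered_limit`, `miss_limit`, and `block_rounds` parameters are all hypothetical. The function and class names (`select_targets`, `TargetScheduler`) are likewise invented for the example.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_targets(candidates, vulnerable, threshold=0.9):
    """Return candidate functions whose best similarity to any known
    vulnerable function reaches the (hypothetical) threshold."""
    targets = []
    for name, vec in candidates.items():
        best = max(cosine_similarity(vec, v) for v in vulnerable.values())
        if best >= threshold:
            targets.append(name)
    return targets


@dataclass
class TargetScheduler:
    """Toy target scheduler. All limits below are assumptions."""
    targets: list
    covered_limit: int = 3   # hits after which a target is "sufficiently covered"
    miss_limit: int = 2      # consecutive missed rounds before a temporary block
    block_rounds: int = 2    # length of a temporary block, in rounds
    hits: dict = field(default_factory=dict)
    misses: dict = field(default_factory=dict)
    blocked_until: dict = field(default_factory=dict)
    permanent: set = field(default_factory=set)
    round_no: int = 0

    def active_targets(self):
        """Targets that are neither permanently nor temporarily blocked."""
        return [
            t for t in self.targets
            if t not in self.permanent
            and self.blocked_until.get(t, 0) <= self.round_no
        ]

    def end_round(self, reached):
        """Record which active targets were reached in this fuzzing round."""
        for t in self.active_targets():
            if t in reached:
                self.hits[t] = self.hits.get(t, 0) + 1
                self.misses[t] = 0
                if self.hits[t] >= self.covered_limit:
                    self.permanent.add(t)  # permanent block
            else:
                self.misses[t] = self.misses.get(t, 0) + 1
                if self.misses[t] >= self.miss_limit:
                    # temporary block for the next `block_rounds` rounds
                    self.blocked_until[t] = self.round_no + 1 + self.block_rounds
                    self.misses[t] = 0
        self.round_no += 1
```

In this sketch a fuzzer would call `active_targets()` at the start of each round to decide which targets guide seed selection, then report the targets it reached through `end_round()`. A target that keeps being reached is retired for good, while one that is never reached is set aside for a few rounds and then retried, which mirrors the permanent and periodic blocking the abstract describes.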
About the journal:
Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security. JISA links the scientific and research community with industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research work with innovative industrial approaches from internationally renowned information security experts and researchers.