Directed grey box fuzzy testing for power terminal device firmware with intermediate representation similarity comparison

IF 3.8 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Information Security and Applications Pub Date : 2025-03-29 DOI:10.1016/j.jisa.2025.104038

Zhongyuan Qin , Jiaqi Chen , Xin Sun , Yubo Song , Hua Dai , Weiwei Chen , Bang Lv , Kanghui Wang

Volume 90, Article 104038. https://www.sciencedirect.com/science/article/pii/S2214212625000766

Citations: 0

Abstract

The proliferation of heterogeneous devices in power IoT terminals significantly increases security risks due to firmware vulnerabilities, thereby threatening the stability and reliability of power systems. However, existing Directed Greybox Fuzzing (DGF) methods face challenges, such as the need for manual identification of vulnerable code and limitations to specific architectures. This paper proposes a DGF approach, guided by intermediate representation similarity comparison, comprising two main components: objective function localization and directed greybox fuzzing. In the objective function localization phase, support for multiple architectures is achieved by lifting the binary code to LLVM Intermediate Representation (IR). Given that functions may vary in both structure and semantics, we represent functions using both structural and semantic features. We employ word embedding techniques based on Natural Language Processing (NLP) and graph neural network models to construct feature vectors. By calculating the feature similarity between each function and known vulnerable functions, we automatically identify highly similar functions as targets. In the directed greybox fuzzing phase, to address issues like high false positive rates and unreachable targets, we designed a target scheduling mechanism. This mechanism permanently blocks targets that have been sufficiently covered and periodically blocks those that have not been covered, thereby further improving the efficiency of fuzzing. Experimental results on two datasets demonstrate the effectiveness of this method in identifying vulnerabilities in power terminal equipment.
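
The two phases described in the abstract can be illustrated with a minimal sketch. It assumes that each function's feature vector has already been computed (the paper builds them with word embeddings and a graph neural network over LLVM IR; that step is not reproduced here). Cosine similarity, the 0.9 threshold, and the `enough_hits`, `patience`, and `cooldown` parameters are hypothetical choices made for illustration, since the abstract does not specify the similarity metric or the blocking criteria.

```python
import math
from dataclasses import dataclass


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_targets(candidates, vulnerable, threshold=0.9):
    """Return candidate function names whose best similarity to any known
    vulnerable function reaches the (hypothetical) threshold."""
    targets = []
    for name, vec in candidates.items():
        best = max((cosine_similarity(vec, v) for v in vulnerable.values()),
                   default=0.0)
        if best >= threshold:
            targets.append(name)
    return targets


@dataclass
class TargetState:
    hits: int = 0           # rounds in which the target was reached
    misses: int = 0         # consecutive rounds in which it was not reached
    blocked_until: int = 0  # first round in which the target is active again
    retired: bool = False   # permanently blocked (sufficiently covered)


class TargetScheduler:
    """Toy target scheduler: retire covered targets, pause uncovered ones."""

    def __init__(self, targets, enough_hits=3, patience=2, cooldown=2):
        self.state = {t: TargetState() for t in targets}
        self.enough_hits = enough_hits  # hits needed before permanent blocking
        self.patience = patience        # misses tolerated before a pause
        self.cooldown = cooldown        # number of rounds a pause lasts
        self.round = 0

    def active_targets(self):
        return [t for t, s in self.state.items()
                if not s.retired and s.blocked_until <= self.round]

    def end_round(self, reached):
        """Update every active target with the set of targets reached this round."""
        for t in self.active_targets():
            s = self.state[t]
            if t in reached:
                s.hits += 1
                s.misses = 0
                if s.hits >= self.enough_hits:
                    s.retired = True
            else:
                s.misses += 1
                if s.misses >= self.patience:
                    s.blocked_until = self.round + 1 + self.cooldown
                    s.misses = 0
        self.round += 1


if __name__ == "__main__":
    candidates = {"f_a": [1.0, 0.0], "f_b": [0.0, 1.0], "f_c": [0.9, 0.1]}
    vulnerable = {"known_vuln": [1.0, 0.0]}
    sched = TargetScheduler(select_targets(candidates, vulnerable), enough_hits=2)
    sched.end_round({"f_a"})
    sched.end_round({"f_a"})
    print(sched.active_targets())
```

In this sketch a target that keeps being reached is retired so the fuzzer stops spending energy on it, while a target that is never reached is paused for a few rounds and then retried, which loosely mirrors the permanent and periodic blocking the abstract describes.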

Source journal

Journal of Information Security and Applications (Computer Science: Computer Networks and Communications)

CiteScore: 10.90

Self-citation rate: 5.40%

Articles published: 206

Review time: 56 days

About the journal: Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications with relevance to information security and applications. JISA provides a common linkage between a vibrant scientific and research community and industry professionals by offering a clear view on modern problems and challenges in information security, as well as identifying promising scientific and "best-practice" solutions. JISA issues offer a balance between original research work and innovative industrial approaches by internationally renowned information security experts and researchers.