M. Nandish;Jalesh Kumar;H. G. Mohan;M. V. Manoj Kumar
IEEE Access, vol. 13, pp. 124250-124263, published 2025-07-15. Impact factor 3.4 (JCR Q2).
DOI: 10.1109/ACCESS.2025.3588950
https://ieeexplore.ieee.org/document/11080410/
Transformer-Based Vulnerability Detection in IoT Firmware Binaries Using Opcode Sequences
Firmware security is critical for maintaining the integrity of embedded systems. However, detecting vulnerabilities in firmware binaries is challenging because source code is unavailable, binary structures are inherently complex, hardware architectures are diverse, and deep contextual representations are difficult to extract from binaries. The proposed approach uses Decoding-enhanced BERT with Disentangled Attention (DeBERTa), a novel transformer-based model, to detect vulnerabilities in firmware binaries. Firmware binaries are first disassembled to extract opcode sequences, which are then tokenized and encoded as inputs to the DeBERTa model. The model processes the opcode sequences and generates embeddings, which are used for classification. Three classifiers operate on the DeBERTa-generated embeddings: Random Forest, a Multi-Layer Perceptron, and a GAN-based classifier. The proposed model learns deep contextual representations of firmware code, capturing intricate syntactic and semantic relationships. The evaluation is conducted on IoT firmware binaries collected from real-world IoT projects, reflecting practical and diverse vulnerability scenarios. Experimental results show that the proposed DeBERTa-based model achieves 97% accuracy, 97% recall, and a 94.6% F1-score, outperforming conventional embedding techniques. The findings indicate that the opcode sequence feature effectively and reliably distinguishes different types of vulnerable and benign IoT samples.
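The abstract does not say which disassembler, tokenizer, or vocabulary the authors use. As an illustration only, the opcode extraction and encoding step might look like the minimal sketch below. It assumes an objdump-style listing with tab-separated fields, a word-level vocabulary with one token per mnemonic, and BERT-style special tokens; the function names, the `SAMPLE` listing, and the `max_len` parameter are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical special tokens, following common BERT-style conventions.
PAD, UNK, CLS, SEP = "[PAD]", "[UNK]", "[CLS]", "[SEP]"
SPECIALS = [PAD, UNK, CLS, SEP]


def extract_opcodes(disassembly: str) -> list[str]:
    """Return the mnemonic of every instruction line in an objdump-style listing."""
    opcodes = []
    for line in disassembly.splitlines():
        parts = line.split("\t")
        # Assumed layout: "<addr>:\t<raw bytes>\t<mnemonic> <operands>".
        # Labels, blank lines, and byte-continuation lines have fewer fields.
        if len(parts) >= 3 and parts[0].strip().endswith(":"):
            fields = parts[2].split()
            if fields:
                opcodes.append(fields[0].lower())
    return opcodes


def build_vocab(sequences: list[list[str]]) -> dict[str, int]:
    """Map the special tokens and every observed opcode to an integer id."""
    counts = Counter(op for seq in sequences for op in seq)
    # Most frequent opcodes get the lowest ids; ties are broken alphabetically.
    ordered = sorted(counts, key=lambda op: (-counts[op], op))
    return {tok: i for i, tok in enumerate(SPECIALS + ordered)}


def encode(opcodes: list[str], vocab: dict[str, int], max_len: int):
    """Wrap in [CLS]/[SEP], truncate to max_len, pad; return (ids, attention mask)."""
    tokens = [CLS] + opcodes[: max_len - 2] + [SEP]
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens]
    mask = [1] * len(ids)
    pad = max_len - len(ids)
    return ids + [vocab[PAD]] * pad, mask + [0] * pad


# Hypothetical ARM listing in the assumed format, for illustration only.
SAMPLE = (
    "00010074 <main>:\n"
    "   10074:\te92d4800 \tpush\t{fp, lr}\n"
    "   10078:\te28db004 \tadd\tfp, sp, #4\n"
    "   1007c:\te3a00000 \tmov\tr0, #0\n"
    "   10080:\te8bd8800 \tpop\t{fp, pc}\n"
)

if __name__ == "__main__":
    ops = extract_opcodes(SAMPLE)
    vocab = build_vocab([ops])
    ids, mask = encode(ops, vocab, max_len=8)
    print(ops)   # ['push', 'add', 'mov', 'pop']
    print(ids)   # [2, 7, 4, 5, 6, 3, 0, 0]
    print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

The id sequence and attention mask are the kind of input a transformer encoder such as DeBERTa consumes. Splitting on tabs rather than matching hex bytes with a regular expression avoids misreading mnemonics such as `add`, which consist entirely of hexadecimal characters.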
Journal: IEEE Access
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
About the journal:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, or interesting solutions to engineering problems.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.