VulMPFF: A Vulnerability Detection Method for Fusing Code Features in Multiple Perspectives

IF 1.3 4区 计算机科学 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS
Xiansheng Cao, Junfeng Wang, Peng Wu, Zhiyang Fang
{"title":"VulMPFF: A Vulnerability Detection Method for Fusing Code Features in Multiple Perspectives","authors":"Xiansheng Cao,&nbsp;Junfeng Wang,&nbsp;Peng Wu,&nbsp;Zhiyang Fang","doi":"10.1049/2024/4313185","DOIUrl":null,"url":null,"abstract":"<div>\n <p>Source code vulnerabilities are one of the significant threats to software security. Existing deep learning-based detection methods have proven their effectiveness. However, most of them extract code information on a single intermediate representation of code (IRC), which often fails to extract multiple information hidden in the code fully, significantly limiting their performance. To address this problem, we propose VulMPFF, a vulnerability detection method that fuses code features under multiple perspectives. It extracts IRC from three perspectives: code sequence, lexical and syntactic relations, and graph structure to capture the vulnerability information in the code, which effectively realizes the complementary information of multiple IRCs and improves vulnerability detection performance. Specifically, VulMPFF extracts serialized abstract syntax tree as IRC from code sequence, lexical and syntactic relation perspective, and code property graph as IRC from graph structure perspective, and uses Bi-LSTM model with attention mechanism and graph neural network with attention mechanism to learn the code features from multiple perspectives and fuse them to detect the vulnerabilities in the code, respectively. We design a dual-attention mechanism to highlight critical code information for vulnerability triggering and better accomplish the vulnerability detection task. We evaluate our approach on three datasets. Experiments show that VulMPFF outperforms existing state-of-the-art vulnerability detection methods (i.e., Rats, FlawFinder, VulDeePecker, SySeVR, Devign, and Reveal) in Acc and F1 score, with improvements ranging from 14.71% to 145.78% and 152.08% to 344.77%, respectively. Meanwhile, experiments in the open-source project demonstrate that VulMPFF has the potential to detect vulnerabilities in real-world environments.</p>\n </div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2024 1","pages":""},"PeriodicalIF":1.3000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/4313185","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Information Security","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/2024/4313185","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Source code vulnerabilities are one of the significant threats to software security. Existing deep learning-based detection methods have proven their effectiveness. However, most of them extract code information on a single intermediate representation of code (IRC), which often fails to extract multiple information hidden in the code fully, significantly limiting their performance. To address this problem, we propose VulMPFF, a vulnerability detection method that fuses code features under multiple perspectives. It extracts IRC from three perspectives: code sequence, lexical and syntactic relations, and graph structure to capture the vulnerability information in the code, which effectively realizes the complementary information of multiple IRCs and improves vulnerability detection performance. Specifically, VulMPFF extracts serialized abstract syntax tree as IRC from code sequence, lexical and syntactic relation perspective, and code property graph as IRC from graph structure perspective, and uses Bi-LSTM model with attention mechanism and graph neural network with attention mechanism to learn the code features from multiple perspectives and fuse them to detect the vulnerabilities in the code, respectively. We design a dual-attention mechanism to highlight critical code information for vulnerability triggering and better accomplish the vulnerability detection task. We evaluate our approach on three datasets. Experiments show that VulMPFF outperforms existing state-of-the-art vulnerability detection methods (i.e., Rats, FlawFinder, VulDeePecker, SySeVR, Devign, and Reveal) in Acc and F1 score, with improvements ranging from 14.71% to 145.78% and 152.08% to 344.77%, respectively. Meanwhile, experiments in the open-source project demonstrate that VulMPFF has the potential to detect vulnerabilities in real-world environments.

Abstract Image

VulMPFF:多角度融合代码特征的漏洞检测方法
源代码漏洞是软件安全的重大威胁之一。现有的基于深度学习的检测方法已经证明了其有效性。然而,大多数方法都是在单一的代码中间表示(IRC)上提取代码信息,往往不能完全提取隐藏在代码中的多种信息,大大限制了其性能。为了解决这个问题,我们提出了 VulMPFF,一种在多个视角下融合代码特征的漏洞检测方法。它从代码序列、词法和句法关系、图结构三个角度提取 IRC,捕捉代码中的漏洞信息,有效实现了多个 IRC 信息的互补,提高了漏洞检测性能。具体来说,VulMPFF 从代码序列、词法和句法关系角度提取序列化抽象语法树作为 IRC,从图结构角度提取代码属性图作为 IRC,并分别使用具有注意机制的 Bi-LSTM 模型和具有注意机制的图神经网络来学习多个角度的代码特征,并将其融合在一起检测代码中的漏洞。我们设计了一种双重关注机制,以突出用于触发漏洞的关键代码信息,从而更好地完成漏洞检测任务。我们在三个数据集上评估了我们的方法。实验表明,VulMPFF 在 Acc 和 F1 分数上优于现有的一流漏洞检测方法(即 Rats、FlawFinder、VulDeePecker、SySeVR、Devign 和 Reveal),分别提高了 14.71% 到 145.78%,以及 152.08% 到 344.77%。同时,开源项目的实验证明,VulMPFF 具有在真实世界环境中检测漏洞的潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
IET Information Security
IET Information Security 工程技术-计算机:理论方法
CiteScore
3.80
自引率
7.10%
发文量
47
审稿时长
8.6 months
期刊介绍: IET Information Security publishes original research papers in the following areas of information security and cryptography. Submitting authors should specify clearly in their covering statement the area into which their paper falls. Scope: Access Control and Database Security Ad-Hoc Network Aspects Anonymity and E-Voting Authentication Block Ciphers and Hash Functions Blockchain, Bitcoin (Technical aspects only) Broadcast Encryption and Traitor Tracing Combinatorial Aspects Covert Channels and Information Flow Critical Infrastructures Cryptanalysis Dependability Digital Rights Management Digital Signature Schemes Digital Steganography Economic Aspects of Information Security Elliptic Curve Cryptography and Number Theory Embedded Systems Aspects Embedded Systems Security and Forensics Financial Cryptography Firewall Security Formal Methods and Security Verification Human Aspects Information Warfare and Survivability Intrusion Detection Java and XML Security Key Distribution Key Management Malware Multi-Party Computation and Threshold Cryptography Peer-to-peer Security PKIs Public-Key and Hybrid Encryption Quantum Cryptography Risks of using Computers Robust Networks Secret Sharing Secure Electronic Commerce Software Obfuscation Stream Ciphers Trust Models Watermarking and Fingerprinting Special Issues. Current Call for Papers: Security on Mobile and IoT devices - https://digital-library.theiet.org/files/IET_IFS_SMID_CFP.pdf
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信