Software vulnerability detection method based on code attribute graph presentation and Bi-LSTM neural network extraction

Hanqing Jiang, Shaopei Ji, Chengchao Zha, Yanhong Liu
DOI: 10.1117/12.3032032 (https://doi.org/10.1117/12.3032032)
Published in: Other Conferences
Publication date: 2024-06-06
Publication type: Journal Article
Open access: No
Citations: 0

Abstract

Software keeps growing in scale and complexity, and the forms that vulnerabilities take are becoming more diverse. Traditional vulnerability detection methods depend heavily on manual effort and are weak at detecting unknown vulnerabilities, so they can no longer meet the detection requirements of diverse vulnerabilities. To improve the detection of unknown vulnerabilities, many machine learning methods have been applied to software vulnerability detection. However, existing methods lose much of the syntactic and semantic information during code representation, which leads to high false positive and false negative rates. To solve this problem, this paper presents a software vulnerability detection method based on the code attribute graph and Bi-LSTM (bidirectional Long Short-Term Memory). The abstract syntax tree sequence and the control flow graph sequence are extracted from the code attribute graph of each function to reduce the loss of information during code representation, and Bi-LSTM is selected to build the feature extraction model. Experimental results show that, compared with the method based on the abstract syntax tree, this method greatly improves the accuracy and recall of vulnerability detection, improves detection on real data sets that mix source code from multiple software projects, and effectively reduces the false positive rate and false negative rate.
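The abstract reports results in terms of accuracy, recall, false positive rate, and false negative rate, but does not spell out the formulas. The sketch below shows how these four quantities are conventionally computed from a binary confusion matrix. It assumes the standard definitions and a labeling of 1 for a vulnerable function and 0 for a clean one. The names `Confusion`, `confusion`, `ratio`, and `metrics` are hypothetical helpers, and the labels are made up for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # vulnerable functions flagged as vulnerable
    fp: int  # clean functions flagged as vulnerable
    tn: int  # clean functions passed as clean
    fn: int  # vulnerable functions passed as clean


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = vulnerable, 0 = clean)."""
    c = Confusion(0, 0, 0, 0)
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            c.tp += 1
        elif t == 0 and p == 1:
            c.fp += 1
        elif t == 0 and p == 0:
            c.tn += 1
        else:
            c.fn += 1
    return c


def ratio(num, den):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def metrics(c):
    """Standard definitions (an assumption; the abstract gives no formulas)."""
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "recall": ratio(c.tp, c.tp + c.fn),
        "false_positive_rate": ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": ratio(c.fn, c.fn + c.tp),
    }


# Made-up labels, purely for illustration (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
result = metrics(confusion(y_true, y_pred))
print(result)
```

Under these definitions, recall and the false negative rate always sum to 1, so a method that raises recall necessarily lowers the false negative rate. The false positive rate is independent of both and has to be reduced separately.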