DeepVD: Toward Class-Separation Features for Neural Network Vulnerability Detection

2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). Pub Date: 2023-05-01. DOI: 10.1109/ICSE48619.2023.00189

Wenbo Wang, Tien N. Nguyen, Shaohua Wang, Yi Li, Jiyuan Zhang, Aashish Yadavally


Citations: 2

Abstract

Advances in machine learning (ML), including deep learning (DL), have enabled several approaches that implicitly learn vulnerable code patterns to automatically detect software vulnerabilities. A recent study showed that, despite their successes, existing ML/DL-based vulnerability detection (VD) models are limited in their ability to distinguish between the two classes of vulnerable and benign code. We propose DeepVD, a graph-based neural network VD model that emphasizes class-separation features between vulnerable and benign code. DeepVD leverages three types of class-separation features at different levels of abstraction: statement types (similar to Part-of-Speech tagging), Post-Dominator Tree (covering regular flows of execution), and Exception Flow Graph (covering the exception and error-handling flows). We conducted several experiments to evaluate DeepVD on a real-world vulnerability dataset of 303 projects with 13,130 vulnerable methods. Our results show that DeepVD achieves relative improvements over state-of-the-art ML/DL-based VD approaches of 13%–29.6% in precision, 15.6%–28.9% in recall, and 16.4%–25.8% in F-score. Our ablation study confirms that our designed features and components help DeepVD achieve high class-separability between vulnerable and benign code.
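To make the Post-Dominator Tree feature more concrete, the following is a minimal sketch of how immediate post-dominators can be computed for a small control-flow graph using the textbook iterative set-based algorithm. It is not taken from DeepVD; the function names `post_dominators` and `immediate_post_dominators`, the dictionary-based graph encoding, and the example graph are all assumptions made for illustration.

```python
def post_dominators(succ, exit_node):
    """Return, for each node, the set of nodes that post-dominate it.

    succ: dict mapping every node to a list of its successors
          (hypothetical encoding of a control-flow graph).
    A node d post-dominates n if every path from n to exit_node
    passes through d.
    """
    nodes = set(succ)
    # Start from the most permissive assumption and shrink to a fixed point.
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            successors = succ[n]
            if successors:
                common = set.intersection(*(pdom[s] for s in successors))
            else:
                common = set()
            updated = common | {n}
            if updated != pdom[n]:
                pdom[n] = updated
                changed = True
    return pdom


def immediate_post_dominators(succ, exit_node):
    """Return the parent of each node in the post-dominator tree."""
    pdom = post_dominators(succ, exit_node)
    ipdom = {}
    for n, doms in pdom.items():
        strict = doms - {n}
        for d in strict:
            # The immediate post-dominator is the strict post-dominator
            # whose own post-dominator set equals all strict ones of n.
            if pdom[d] == strict:
                ipdom[n] = d
                break
    return ipdom


# Illustrative graph for an if/else that rejoins before the exit.
example_cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}
```

In this example the branch node `cond` has `join` as its tree parent, since both arms of the branch must pass through `join` before reaching `exit`. How DeepVD encodes such a tree as model input is not specified in the abstract.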


