Learning to Identify Security-Related Issues Using Convolutional Neural Networks

David N. Palacio, Daniel McCrystal, Kevin Moran, Carlos Bernal-Cárdenas, D. Poshyvanyk, Chris Shenefiel
Published in: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)
Publication date: 2019-08-01
DOI: 10.1109/ICSME.2019.00024 (https://doi.org/10.1109/ICSME.2019.00024)
Citations: 12

Abstract

Software security is becoming a high priority for large companies and start-ups alike because of the increasing potential for harm that vulnerabilities and breaches carry. However, attaining robust security assurance while delivering features is a precarious balancing act in the context of agile development practices. One way to help development teams secure their software products is to design and develop security-focused automation. We therefore present a novel approach, called SecureReqNet, for automatically identifying whether issues in software issue tracking systems describe security-related content. Our approach consists of a two-phase neural network architecture that operates purely on the natural language descriptions of issues. The first phase learns high-dimensional word embeddings from hundreds of thousands of vulnerability descriptions listed in the CVE database and issue descriptions extracted from open source projects. The second phase then uses the semantic ontology represented by these embeddings to train a convolutional neural network capable of predicting whether a given issue is security-related. We evaluated SecureReqNet by applying it to identify security-related issues in a dataset of thousands of issues mined from popular projects on GitLab and GitHub. We also applied our approach to identify security-related requirements in a commercial software project developed by a major telecommunication company. Our preliminary results are encouraging, with SecureReqNet achieving an accuracy of 96% on open source issues and 71.6% on industrial requirements.
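To make the two-phase idea concrete, the following is a minimal pure-Python sketch of the forward pass of a text CNN over word embeddings. It is not the authors' SecureReqNet implementation: the two-dimensional embedding table, the single hand-set kernel, the output weights, and all function names are hypothetical and chosen only for illustration. In the approach described in the abstract, the embeddings are learned from CVE and issue text and the network weights are trained.

```python
import math

# Phase 1 stand-in: a toy embedding table (hypothetical values).
# The abstract describes learning high-dimensional embeddings from data.
EMBEDDINGS = {
    "buffer": [0.9, 0.1],
    "overflow": [0.8, 0.2],
    "button": [0.1, 0.9],
    "color": [0.0, 1.0],
}
UNKNOWN = [0.0, 0.0]  # vector used for out-of-vocabulary tokens

# Phase 2 stand-in: one convolution kernel of width 2 (hand-set, not trained),
# stored as (kernel, bias), plus the weights of a single sigmoid output unit.
KERNELS = [([[1.0, -1.0], [1.0, -1.0]], 0.0)]
OUT_WEIGHTS = [4.0]
OUT_BIAS = -2.0


def embed(tokens):
    """Map each token to its embedding vector."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def conv1d_max_pool(matrix, kernel, bias):
    """Slide the kernel along the token axis, apply ReLU, keep the maximum."""
    width = len(kernel)
    best = 0.0  # also the result when the text is shorter than the kernel
    for start in range(len(matrix) - width + 1):
        total = bias
        for k in range(width):
            for d in range(len(kernel[k])):
                total += kernel[k][d] * matrix[start + k][d]
        best = max(best, max(0.0, total))
    return best


def predict(text):
    """Return a score in (0, 1); higher means more likely security-related."""
    matrix = embed(text.lower().split())
    features = [conv1d_max_pool(matrix, k, b) for k, b in KERNELS]
    logit = OUT_BIAS + sum(w * f for w, f in zip(OUT_WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    for issue in ("Buffer overflow in parser", "Change button color"):
        label = "security-related" if predict(issue) > 0.5 else "other"
        print(f"{issue!r}: {label}")
```

With these toy weights, the kernel responds to adjacent tokens whose first embedding component dominates, so the first issue scores above 0.5 and the second below it. A real model would use many kernels of several widths and would learn every parameter from labeled issues.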