Hybrid Representation to Locate Vulnerable Lines of Code

Impact Factor: 0.6; JCR: Q4; Category: Computer Science, Software Engineering

International Journal of Software Innovation; Pub Date: 2022-01-01; DOI: 10.4018/ijsi.292020


Citations: 0

Abstract

Locating vulnerable lines of code in large software systems requires substantial effort from human experts, which explains the high cost, in both budget and time, of correcting vulnerabilities. To reduce these costs, automatic vulnerability prediction solutions have been proposed. Existing machine learning (ML)-based solutions are limited in effectiveness because they predict vulnerabilities at a coarse granularity and have difficulty defining suitable code features. To address these limitations, the authors propose an improved ML-based approach that uses a slice-based code representation and the TF-IDF technique to extract effective features automatically. The results show that combining these two techniques with ML techniques makes it possible to build effective vulnerability prediction models (VPMs) that locate vulnerabilities at a finer granularity and with excellent performance: high precision (>98%), a low false negative rate (FNR <2%), and a low false positive rate (FPR <3%). These results outperform software metrics and are comparable to the best-performing recent deep learning-based approaches.
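The abstract does not give implementation details, so the following is only an illustrative sketch, not the authors' method. It shows how TF-IDF weights could be computed over tokenized code slices and how precision, FNR, and FPR are defined. The tokenizer, the smoothed IDF variant, the toy C slices, and the function names `tokenize`, `tfidf_vectors`, and `rates` are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(code_slice):
    """Split a code slice into identifier, number, and punctuation tokens.

    Assumed tokenizer for illustration; the paper's tokenization is not given.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_0-9]", code_slice)


def tfidf_vectors(slices):
    """Return (vocabulary, vectors) with one TF-IDF vector per code slice."""
    docs = [Counter(tokenize(s)) for s in slices]
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Smoothed inverse document frequency (an assumed variant):
    # tokens that occur in fewer slices receive larger weights.
    idf = {
        t: math.log((1 + n) / (1 + sum(t in d for d in docs))) + 1
        for t in vocab
    }
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        # Term frequency is the token count normalized by slice length.
        vectors.append([doc[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def rates(y_true, y_pred):
    """Return (precision, FNR, FPR), where label 1 means 'vulnerable'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, fnr, fpr


# Hypothetical toy slices: an unbounded copy and a bounded one.
slices = [
    "char buf[10]; strcpy(buf, src);",
    "char buf[10]; strncpy(buf, src, 9);",
]
vocab, vectors = tfidf_vectors(slices)
```

In such a setup, each vector would be passed to an ordinary ML classifier together with a vulnerable/non-vulnerable label for its slice. Tokens that distinguish one slice from another (here `strcpy` versus `strncpy`) receive higher weights than tokens shared by every slice.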


Source journal

International Journal of Software Innovation (Computer Science, Software Engineering)

CiteScore: 1.40

Self-citation rate: 0.00%

Number of articles published: 118

About the journal: The International Journal of Software Innovation (IJSI) covers state-of-the-art research and development in all aspects of evolutionary and revolutionary ideas pertaining to software systems and their development. The journal publishes original papers on both theory and practice that reflect and accommodate the fast-changing nature of daily life. Topics of interest include not only application-independent software systems, but also application-specific software systems like healthcare, education, energy, and entertainment software systems, as well as techniques and methodologies for modeling, developing, validating, maintaining, and reengineering software systems and their environments.