HERMES: Using Commit-Issue Linking to Detect Vulnerability-Fixing Commits

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). Pub Date: 2022-03-01. DOI: 10.1109/saner53432.2022.00018

Giang Nguyen-Truong, Hong Jin Kang, D. Lo, Abhishek Sharma, A. Santosa, Asankhaya Sharma, Ming Yi Ang


Citations: 9

Abstract

Software projects today rely on many third-party libraries and are therefore exposed to vulnerabilities in these libraries. When a library vulnerability is fixed, users are notified and advised to upgrade to a new version of the library. However, not all vulnerabilities are publicly disclosed, and users may not be aware of vulnerabilities that affect their applications. There is therefore a need for techniques that can identify silent fixes in libraries, that is, commits that fix bugs with security implications but are not officially disclosed, and alert users to them. We propose a machine learning approach to automatically identify vulnerability-fixing commits. Existing techniques consider only data within a commit, such as its commit message, which does not always contain sufficiently discriminative information. To address this limitation, our approach incorporates the rich source of information available in issue trackers. When a commit does not link to an issue, we use a commit-issue link recovery technique to infer the potential missing link. Our experiments are promising: incorporating information from issue trackers boosts the performance of a vulnerability-fixing commit classifier, improving over the strongest baseline by 11.1% on the entire dataset, which includes commits that do not link to an issue. On a subset of the data in which all commits explicitly link to an issue, our approach improves over the baseline by 12.5%.
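To make the idea of recovering a missing commit-issue link concrete, the following is a minimal sketch and not the technique used in the paper, whose details the abstract does not give. It assumes a simple stand-in: the commit message is compared with each issue's text by Jaccard similarity over word tokens, and the best match is accepted only above a threshold. The function names, the issue ids, and the threshold value of 0.2 are all hypothetical choices made for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recover_link(commit_message, issues, threshold=0.2):
    """Return the id of the most similar issue, or None if no issue
    reaches the threshold.

    `issues` maps a hypothetical issue id to that issue's text.
    The similarity measure and threshold are illustrative assumptions.
    """
    commit_tokens = tokens(commit_message)
    best_id, best_score = None, 0.0
    for issue_id, text in issues.items():
        score = jaccard(commit_tokens, tokens(text))
        if score > best_score:
            best_id, best_score = issue_id, score
    return best_id if best_score >= threshold else None


def classifier_input(commit_message, issues):
    """Build the text a downstream classifier would see: the commit
    message, extended with the recovered issue text when a link is found."""
    issue_id = recover_link(commit_message, issues)
    if issue_id is None:
        return commit_message
    return commit_message + "\n" + issues[issue_id]


if __name__ == "__main__":
    example_issues = {
        "ISSUE-1": "XML parser allows external entity injection",
        "ISSUE-2": "Update documentation for build steps",
    }
    commit = "Disable external entity resolution in XML parser"
    print(recover_link(commit, example_issues))
    print(classifier_input(commit, example_issues))
```

In this sketch a commit with no sufficiently similar issue is passed to the classifier unchanged, which mirrors the abstract's distinction between commits that link to an issue and those that do not.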
