Fei Zuo, Xin Zhang, Yuqi Song, J. Rhee, Jicheng Fu
2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA), published 2023-05-23. DOI: 10.1109/SERA57763.2023.10197730 (https://doi.org/10.1109/SERA57763.2023.10197730)
Commit Message Can Help: Security Patch Detection in Open Source Software via Transformer
As open source software is widely used, the vulnerabilities it contains propagate rapidly to the large number of applications that depend on it. Even worse, many vulnerabilities in open-source projects are fixed silently, so the maintainers of affected software remain unaware of the fix and their software stays exposed to risk. To protect deployed software, an effective patch classification system has become a necessity rather than an option. To this end, some researchers have applied recent advances in natural language processing to learn from both commit messages and code changes. However, these approaches often incur high false positive rates. Moreover, existing work has not yet answered how much the textual description (such as a commit message) alone can influence the final triage. In this paper, we propose a Transformer-based patch classifier that does not use any code changes as input. Surprisingly, extensive experiments show that the proposed approach significantly outperforms other state-of-the-art work, with a high precision of 93.0% and a low false positive rate. Our research therefore further confirms the critical importance of well-crafted commit messages for later software maintenance. Finally, our case study also identifies 48 silent security patches, which can benefit the affected software.
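The abstract reports two distinct metrics, precision and false positive rate, which are easy to conflate. The sketch below shows how each is typically computed from a binary classifier's confusion-matrix counts. The function names and the counts are hypothetical, chosen only for illustration; they are not taken from the paper's experiments.

```python
def precision(tp: int, fp: int) -> float:
    """Share of commits flagged as security patches that really are security patches."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of non-security commits that were wrongly flagged as security patches."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


# Hypothetical counts for illustration only (NOT the paper's data):
# tp = security patches correctly flagged, fp = ordinary commits wrongly flagged,
# tn = ordinary commits correctly left unflagged.
tp, fp, tn = 93, 7, 893

print(f"precision = {precision(tp, fp):.3f}")
print(f"false positive rate = {false_positive_rate(fp, tn):.4f}")
```

Precision is measured over the commits the classifier flags, whereas the false positive rate is measured over all commits that are not security patches, so a classifier can score well on one and poorly on the other when the two classes are heavily imbalanced.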