Dicos: Discovering Insecure Code Snippets from Stack Overflow Posts by Leveraging User Discussions

Annual Computer Security Applications Conference | Pub Date: 2021-12-06 | DOI: 10.1145/3485832.3488026

Hyunji Hong, Seunghoon Woo, Heejo Lee


Citations: 2

Abstract

Online Q&A fora such as Stack Overflow help developers solve the coding problems they face. Despite this advantage, Stack Overflow can also provide insecure code snippets that, if reused, can compromise the security of the entire software. We present Dicos, an accurate approach for discovering insecure code snippets by examining the change history of Stack Overflow posts. When a security issue is detected in a post, the insecure code is fixed through user discussions, leaving a change history. Inspired by this process, Dicos first extracts the change history from a Stack Overflow post and then analyzes whether that history contains security patches, using pre-selected features that can effectively identify security patches. Finally, when such changes are detected, Dicos determines that the code snippet as it stood before the security patch is insecure. To evaluate Dicos, we collected 1,958,283 Stack Overflow posts tagged with C, C++, and Android. Applied to these posts, Dicos discovered 12,458 insecure posts (i.e., 14,719 insecure code snippets) with 91% precision and 93% recall. We further confirmed that the latest versions of 151 out of 2,000 popular C/C++ open-source software projects contain at least one insecure code snippet taken from Stack Overflow and discovered by Dicos. Dicos can thus help prevent further propagation of insecure code and contribute to a safe code reuse environment.
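The three steps in the abstract (extract the change history, decide whether a change is a security patch, flag the pre-patch snippet) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the pre-selected features, so the keyword set, the API set, the `Revision` type, and the rule that both signals must be present are all assumptions made here for illustration.

```python
import difflib
import re
from dataclasses import dataclass

# Hypothetical feature lists; the abstract does not enumerate Dicos's features.
SECURITY_KEYWORDS = {"overflow", "vulnerab", "insecure", "unsafe", "exploit"}
SENSITIVE_APIS = {"strcpy", "strcat", "sprintf", "gets", "memcpy"}


@dataclass
class Revision:
    """One entry in a post's change history (hypothetical structure)."""
    code: str     # the code snippet as of this revision
    comment: str  # edit summary or discussion text attached to the revision


def changed_lines(old: str, new: str) -> list[str]:
    """Return every line that was removed from `old` or added in `new`."""
    a, b = old.splitlines(), new.splitlines()
    out: list[str] = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
        if tag != "equal":
            out.extend(a[i1:i2])
            out.extend(b[j1:j2])
    return out


def looks_like_security_patch(old: Revision, new: Revision) -> bool:
    """Assumed rule: the diff touches a sensitive API call and the
    accompanying comment mentions a security-related keyword."""
    delta = " ".join(changed_lines(old.code, new.code))
    touches_api = any(
        re.search(rf"\b{re.escape(api)}\s*\(", delta) for api in SENSITIVE_APIS
    )
    mentions_security = any(k in new.comment.lower() for k in SECURITY_KEYWORDS)
    return touches_api and mentions_security


def find_insecure_snippets(history: list[Revision]) -> list[str]:
    """Flag the pre-patch snippet of each revision pair that looks like a fix."""
    return [
        old.code
        for old, new in zip(history, history[1:])
        if looks_like_security_patch(old, new)
    ]


if __name__ == "__main__":
    history = [
        Revision("char buf[8];\nstrcpy(buf, src);", "initial answer"),
        Revision("char buf[8];\nstrncpy(buf, src, sizeof buf - 1);",
                 "fix buffer overflow pointed out in comments"),
    ]
    for snippet in find_insecure_snippets(history):
        print("Possibly insecure (pre-patch) snippet:\n" + snippet)
```

Requiring both signals trades recall for precision in this toy rule; a real system would presumably tune such a combination against labeled data, as the reported precision and recall figures suggest the authors did.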



