主动改进克隆异常报告

2012 34th International Conference on Software Engineering (ICSE) Pub Date : 2012-06-02 DOI:10.1109/ICSE.2012.6227175

Lucia, D. Lo, Lingxiao Jiang, Aditya Budi

{"title":"主动改进克隆异常报告","authors":"Lucia, D. Lo, Lingxiao Jiang, Aditya Budi","doi":"10.1109/ICSE.2012.6227175","DOIUrl":null,"url":null,"abstract":"Software clones have been widely studied in the recent literature and shown useful for finding bugs because inconsistent changes among clones in a clone group may indicate potential bugs. However, many inconsistent clone groups are not real bugs (true positives). The excessive number of false positives could easily impede broad adoption of clone-based bug detection approaches. In this work, we aim to improve the usability of clone-based bug detection tools by increasing the rate of true positives found when a developer analyzes anomaly reports. Our idea is to control the number of anomaly reports a user can see at a time and actively incorporate incremental user feedback to continually refine the anomaly reports. Our system first presents top few anomaly reports from the list of reports generated by a tool in its default ordering. Users then either accept or reject each of the reports. Based on the feedback, our system automatically and iteratively refines a classification model for anomalies and re-sorts the rest of the reports. Our goal is to present the true positives to the users earlier than the default ordering. The rationale of the idea is based on our observation that false positives among the inconsistent clone groups could share common features (in terms of code structure, programming patterns, etc.), and these features can be learned from the incremental user feedback. We evaluate our refinement process on three sets of clone-based anomaly reports from three large real programs: the Linux Kernel (C), Eclipse, and ArgoUML (Java), extracted by a clone-based anomaly detection tool. The results show that compared to the original ordering of bug reports, we can improve the rate of true positives found (i.e., true positives are found faster) by 11%, 87%, and 86% for Linux kernel, Eclipse, and ArgoUML, respectively.","PeriodicalId":420187,"journal":{"name":"2012 34th International Conference on Software Engineering (ICSE)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":"{\"title\":\"Active refinement of clone anomaly reports\",\"authors\":\"Lucia, D. Lo, Lingxiao Jiang, Aditya Budi\",\"doi\":\"10.1109/ICSE.2012.6227175\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Software clones have been widely studied in the recent literature and shown useful for finding bugs because inconsistent changes among clones in a clone group may indicate potential bugs. However, many inconsistent clone groups are not real bugs (true positives). The excessive number of false positives could easily impede broad adoption of clone-based bug detection approaches. In this work, we aim to improve the usability of clone-based bug detection tools by increasing the rate of true positives found when a developer analyzes anomaly reports. Our idea is to control the number of anomaly reports a user can see at a time and actively incorporate incremental user feedback to continually refine the anomaly reports. Our system first presents top few anomaly reports from the list of reports generated by a tool in its default ordering. Users then either accept or reject each of the reports. Based on the feedback, our system automatically and iteratively refines a classification model for anomalies and re-sorts the rest of the reports. Our goal is to present the true positives to the users earlier than the default ordering. The rationale of the idea is based on our observation that false positives among the inconsistent clone groups could share common features (in terms of code structure, programming patterns, etc.), and these features can be learned from the incremental user feedback. We evaluate our refinement process on three sets of clone-based anomaly reports from three large real programs: the Linux Kernel (C), Eclipse, and ArgoUML (Java), extracted by a clone-based anomaly detection tool. The results show that compared to the original ordering of bug reports, we can improve the rate of true positives found (i.e., true positives are found faster) by 11%, 87%, and 86% for Linux kernel, Eclipse, and ArgoUML, respectively.\",\"PeriodicalId\":420187,\"journal\":{\"name\":\"2012 34th International Conference on Software Engineering (ICSE)\",\"volume\":\"47 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-06-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"27\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 34th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2012.6227175\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 34th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2012.6227175","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 27

摘要

软件克隆在最近的文献中得到了广泛的研究，并被证明对发现错误很有用，因为克隆组中克隆之间不一致的变化可能表明潜在的错误。然而，许多不一致的克隆组并不是真正的bug(真正的阳性)。过多的误报很容易阻碍基于克隆的错误检测方法的广泛采用。在这项工作中，我们的目标是通过增加开发人员分析异常报告时发现的真阳性率来提高基于克隆的错误检测工具的可用性。我们的想法是控制用户一次可以看到的异常报告的数量，并积极地合并增量用户反馈，以不断地改进异常报告。我们的系统首先呈现由工具以默认顺序生成的报告列表中的前几个异常报告。然后用户接受或拒绝每个报告。基于反馈，我们的系统自动迭代地细化异常的分类模型，并对报告的其余部分重新排序。我们的目标是在默认排序之前向用户展示真正的正面信息。这个想法的基本原理是基于我们的观察，即不一致的克隆组之间的误报可能具有共同的特征(在代码结构、编程模式等方面)，并且这些特征可以从增量用户反馈中学习。我们通过基于克隆的异常检测工具提取了来自三个大型真实程序:Linux Kernel (C)、Eclipse和ArgoUML (Java)的三组基于克隆的异常报告来评估我们的改进过程。结果表明，与原来的bug报告顺序相比，我们可以将Linux内核、Eclipse和ArgoUML的真阳性发现率(即更快地发现真阳性)分别提高11%、87%和86%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Active refinement of clone anomaly reports

Software clones have been widely studied in the recent literature and shown useful for finding bugs because inconsistent changes among clones in a clone group may indicate potential bugs. However, many inconsistent clone groups are not real bugs (true positives). The excessive number of false positives could easily impede broad adoption of clone-based bug detection approaches. In this work, we aim to improve the usability of clone-based bug detection tools by increasing the rate of true positives found when a developer analyzes anomaly reports. Our idea is to control the number of anomaly reports a user can see at a time and actively incorporate incremental user feedback to continually refine the anomaly reports. Our system first presents top few anomaly reports from the list of reports generated by a tool in its default ordering. Users then either accept or reject each of the reports. Based on the feedback, our system automatically and iteratively refines a classification model for anomalies and re-sorts the rest of the reports. Our goal is to present the true positives to the users earlier than the default ordering. The rationale of the idea is based on our observation that false positives among the inconsistent clone groups could share common features (in terms of code structure, programming patterns, etc.), and these features can be learned from the incremental user feedback. We evaluate our refinement process on three sets of clone-based anomaly reports from three large real programs: the Linux Kernel (C), Eclipse, and ArgoUML (Java), extracted by a clone-based anomaly detection tool. The results show that compared to the original ordering of bug reports, we can improve the rate of true positives found (i.e., true positives are found faster) by 11%, 87%, and 86% for Linux kernel, Eclipse, and ArgoUML, respectively.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2012 34th International Conference on Software Engineering (ICSE)

自引率

0.00%

发文量