Automatic bug labeling using semantic information from LSI

2014 Seventh International Conference on Contemporary Computing (IC3). Pub Date: 2014-08-01. DOI: 10.1109/IC3.2014.6897203

Indu Chawla, S. Singh


Citations: 15

Abstract



Most open source projects provide a defect tracking system where users, developers, and testers can report problems directly. The fields in a bug report help the triager and the debugger understand the problem better. They also support other tasks, such as accurately assessing the priority and severity of bugs and identifying an appropriate developer to resolve them. The label field is one such field. In many bug repositories, however, the label field is either absent or incorrectly assigned, so automatic bug labeling is needed to make bug reports more informative. This paper presents an automated technique for bug labeling using term frequency-inverse document frequency (TF-IDF) and latent semantic indexing (LSI). An experimental study shows that results improve when semantically similar words obtained from LSI are added to the terms extracted using TF-IDF. Using LSI together with TF-IDF, we achieved 61.5% accuracy for polish bug reports and 62.8% accuracy for security bug reports, compared with 53.8% for polish and 61% for security bug reports using TF-IDF alone.
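The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea: extract the top TF-IDF terms of a bug report, then add terms that lie close to them in an LSI latent space. It is not the authors' implementation. The toy reports, the function names, the whitespace tokenizer, the smoothed IDF formula, and the rank of 2 are all assumptions chosen for illustration.

```python
import math
from collections import Counter

import numpy as np

# Hypothetical toy bug reports; these are NOT data from the paper.
REPORTS = [
    "password leak in login form",
    "login token leak exposes password",
    "button alignment wrong in toolbar",
    "toolbar icon alignment and padding wrong",
]


def tfidf_matrix(docs):
    """Return (vocab, term-by-document TF-IDF matrix)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    # Document frequency: number of reports containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    n_docs = len(docs)
    matrix = np.zeros((len(vocab), n_docs))
    for j, toks in enumerate(tokenized):
        for term, count in Counter(toks).items():
            tf = count / len(toks)
            # Assumed smoothing (+1) so terms present in every report keep some weight.
            idf = math.log(n_docs / df[term]) + 1.0
            matrix[index[term], j] = tf * idf
    return vocab, matrix


def top_terms(vocab, matrix, doc_id, k=3):
    """Top-k terms of one report, ranked by TF-IDF weight."""
    col = matrix[:, doc_id]
    order = np.argsort(-col)[:k]
    return [vocab[i] for i in order if col[i] > 0]


def lsi_term_vectors(matrix, rank=2):
    """Embed terms in a rank-`rank` latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def similar_terms(vocab, term_vecs, term, k=2):
    """The k terms closest to `term` by cosine similarity in the latent space."""
    norms = np.linalg.norm(term_vecs, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero vectors
    unit = term_vecs / norms[:, None]
    sims = unit @ unit[vocab.index(term)]
    order = [i for i in np.argsort(-sims) if vocab[i] != term]
    return [vocab[i] for i in order[:k]]


vocab, matrix = tfidf_matrix(REPORTS)
term_vecs = lsi_term_vectors(matrix, rank=2)

# TF-IDF keywords for the first report, expanded with LSI neighbours.
keywords = top_terms(vocab, matrix, doc_id=0)
expanded = set(keywords)
for kw in keywords:
    expanded.update(similar_terms(vocab, term_vecs, kw))
print(sorted(expanded))
```

The expanded term set could then be matched against label-specific vocabularies (for example, terms typical of security or polish reports) to choose a label; how that matching is done in the paper is not stated in the abstract.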
