Checking Refactoring Detection Results Using Code Changes Encoding for Improved Accuracy

Liang Tan, Christoph Bockisch
Published in: 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM), 2022-10-01
DOI: 10.1109/SCAM55253.2022.00016 (https://doi.org/10.1109/SCAM55253.2022.00016)
Citations: 1

Abstract

During software maintenance, for example, it is often important to know the reason for a code change, so tools that automatically detect changes caused by refactorings are an active research topic. The tool RefDiff does this for multiple programming languages. It achieves good precision, but at the cost of a large number of false negatives, because it has to use a high threshold when selecting refactoring candidates. We have created a result checker that improves the overall performance of RefDiff by admitting more candidates and then removing false positives from the RefDiff detection results. The checker encodes the textual differences (so-called diffs) corresponding to the results and uses machine learning to predict the refactoring type they contain. The main contribution of this paper is the approach for extracting the diffs from the detection results and encoding them as image data for machine learning, together with the training of the machine learning algorithm. We show that lowering the candidate threshold in conjunction with the checker improves not only the recall of RefDiff but also its precision. Our approach improves the RefDiff detection results to 99.5% precision and 95.2% recall.
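The abstract does not say how a diff is turned into image data, so the following is only an illustrative sketch of the general idea, not the authors' encoding. It assumes a fixed-size grid in which each row is one diff line and each cell holds a line-kind marker plus a clipped character code. The names `encode_diff`, `WIDTH`, `HEIGHT`, and `KIND` are hypothetical.

```python
import difflib

# Assumed image size: these values are illustrative, not taken from the paper.
WIDTH = 32   # characters kept per diff line (columns)
HEIGHT = 8   # diff lines kept per image (rows)

# Assumed marker values for the kind of diff line a row holds.
KIND = {" ": 0, "+": 1, "-": 2}


def encode_diff(before: str, after: str) -> list[list[tuple[int, int]]]:
    """Encode the line diff of two code snippets as a HEIGHT x WIDTH grid.

    Each cell is (kind, code): kind marks context/added/removed lines and
    code is the character's code point clipped to 255 (0 means padding).
    """
    rows = []
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        tag, text = line[0], line[2:]
        if tag == "?":
            # ndiff hint lines only point at intraline changes; skip them.
            continue
        kind = KIND[tag]
        cells = [(kind, min(ord(ch), 255)) for ch in text[:WIDTH]]
        # Pad short lines on the right so every row has the same width.
        cells += [(kind, 0)] * (WIDTH - len(cells))
        rows.append(cells)
    # Truncate long diffs and pad short ones so every image has the same height.
    rows = rows[:HEIGHT]
    rows += [[(0, 0)] * WIDTH for _ in range(HEIGHT - len(rows))]
    return rows


if __name__ == "__main__":
    old = "int total(int a) {\n  return a;\n}"
    new = "int sum(int a) {\n  return a;\n}"
    image = encode_diff(old, new)
    # Print the line-kind marker of each row as a quick visual check.
    print([row[0][0] for row in image])
```

A fixed-size grid like this could be fed to an image classifier that predicts a refactoring type such as a method rename. A detection result whose predicted type disagrees with the type RefDiff reported would then be a candidate false positive.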