Checking Refactoring Detection Results Using Code Changes Encoding for Improved Accuracy

2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM). Pub Date: 2022-10-01. DOI: 10.1109/SCAM55253.2022.00016

Liang Tan, Christoph Bockisch

Citations: 1

Abstract

During software maintenance, for example, it is often important to know the reason for a code change, and tools are therefore being researched to automatically detect changes that result from refactorings. The tool RefDiff does this for multiple programming languages. It offers good precision, but at the cost of many false negatives, because its refactoring candidate selection requires a high threshold. We have created a result checker that improves the overall performance of RefDiff by admitting more candidates and then filtering false positives out of the RefDiff detection results. The checker encodes the textual differences (so-called diffs) corresponding to the results and uses machine learning to predict the refactoring type they contain. The main contribution of this paper is the approach for extracting the diffs from the detection results and encoding them as image data for machine learning, together with the training of the machine learning algorithm. We show that lowering the candidate threshold in conjunction with the checker improves not only the recall of RefDiff but also its precision. Our approach improves the RefDiff detection results to 99.5% precision and 95.2% recall.
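
The abstract does not say how a diff is turned into image data, so the following is only a minimal sketch of one possible encoding, not the authors' method. It assumes a hypothetical layout in which each diff line becomes one pixel row, each character becomes one pixel, and the color channel records whether the line was added, removed, or unchanged. The function name `encode_diff_as_image`, the channel constants, and the default image size are all illustrative assumptions; only Python's standard `difflib` module is used.

```python
import difflib

# Hypothetical channel assignment; the paper's real encoding is not given here.
ADDED, REMOVED, CONTEXT = 0, 1, 2

# ndiff marks every output line with one of these two-character prefixes.
PREFIX_TO_CHANNEL = {"+ ": ADDED, "- ": REMOVED, "  ": CONTEXT}


def encode_diff_as_image(before, after, width=32, height=16):
    """Return a height x width x 3 nested list of ints in the range 0..255.

    Each diff line fills one pixel row. Each character sets the intensity of
    one pixel (its code point, capped at 255) in the channel that matches the
    kind of line it came from. Unused pixels stay black.
    """
    image = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    row = 0
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        prefix, text = line[:2], line[2:]
        channel = PREFIX_TO_CHANNEL.get(prefix)
        if channel is None:
            # "? " lines are intraline hints from ndiff, not code text.
            continue
        if row >= height:
            # Truncate diffs that are longer than the image is tall.
            break
        for col, ch in enumerate(text[:width]):
            image[row][col][channel] = min(ord(ch), 255)
        row += 1
    return image


if __name__ == "__main__":
    img = encode_diff_as_image("int a = 1;\nreturn a;", "int b = 1;\nreturn b;")
    print(len(img), len(img[0]), len(img[0][0]))
```

A fixed-size array like this could then be fed to an image classifier that predicts a refactoring type; which classifier and which image dimensions work well are questions the abstract leaves open.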
