Checking Refactoring Detection Results Using Code Changes Encoding for Improved Accuracy
Authors: Liang Tan, Christoph Bockisch
Venue: 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM)
Published: 2022-10-01
DOI: 10.1109/SCAM55253.2022.00016 (https://doi.org/10.1109/SCAM55253.2022.00016)
During software maintenance, for example, it is often important to know the reason for a code change, so tools that automatically detect changes caused by refactorings are an active research topic. The tool RefDiff does this for multiple programming languages. It achieves good precision, but at the cost of many false negatives, because it has to use a high threshold when selecting refactoring candidates. We have created a result checker that improves RefDiff's overall performance: a lower threshold admits more candidates, and the checker then filters false positives out of RefDiff's detection results. The checker encodes the textual differences (so-called diffs) corresponding to the results and uses machine learning to predict the refactoring type they contain. The main contributions of this paper are the approach for extracting the diffs from the detection results and encoding them as image data for machine learning, and the training of the machine learning algorithm. We show that lowering the candidate threshold in conjunction with the checker improves not only the recall of RefDiff but also its precision. Our approach improves the RefDiff detection results to 99.5% precision and 95.2% recall.
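The abstract does not say how a diff is turned into image data, so the following is only a minimal sketch of the general idea, not the authors' encoding. It assumes a grayscale grid in which each changed diff line becomes one row and each character becomes one pixel value. The function names `diff_lines` and `encode_diff` and the `width` and `height` parameters are hypothetical.

```python
from __future__ import annotations

import difflib


def diff_lines(before: str, after: str) -> list[str]:
    """Return the changed lines of a unified diff, keeping the +/- marker."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=0
        )
    )
    # The first two entries are the '---' / '+++' file headers; hunk headers
    # start with '@@', so only lines starting with '+' or '-' are changes.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


def encode_diff(lines: list[str], width: int = 16, height: int = 4) -> list[list[int]]:
    """Encode diff lines as a fixed-size grid of pixel values in 0..255.

    Assumption for illustration: one row per diff line, one pixel per
    character (code point clipped to 255), zero padding elsewhere.
    """
    image = [[0] * width for _ in range(height)]
    for row, line in enumerate(lines[:height]):
        for col, ch in enumerate(line[:width]):
            image[row][col] = min(ord(ch), 255)
    return image


if __name__ == "__main__":
    old = "int total(int a) {\n  return a;\n}"
    new = "int sum(int a) {\n  return a;\n}"
    for pixel_row in encode_diff(diff_lines(old, new)):
        print(pixel_row)
```

A fixed grid size is used here because image classifiers typically expect inputs of uniform shape. Longer diffs are truncated and shorter ones are zero-padded, which is a simplification of this sketch.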