Tree-based Mining of Fine-grained Code Changes to Detect Unknown Change Patterns
Yoshiki Higo, Junnosuke Matsumoto, S. Kusumoto
2021 28th Asia-Pacific Software Engineering Conference (APSEC), 2021-12-01
https://doi.org/10.1109/APSEC53868.2021.00014
In software development, source code is changed repeatedly for various reasons. Similar code changes are called change patterns. Identifying change patterns supports software development in several ways; for example, change patterns can be used to collect ingredients for code completion or automated program repair. Many studies have proposed techniques for detecting change patterns. For example, Negara et al. proposed a technique that derives change patterns from edit scripts. Negara's technique can detect fine-grained change patterns, but we consider that it leaves room for improvement. We found that it occasionally generates change patterns from structurally different changes, and that this happens because it matches changes by comparing their text. In this study, we propose a new change mining technique that detects change patterns only from structurally identical changes by taking the structure of the abstract syntax trees into account. We implemented the proposed technique as a tool, TC2P, and compared it with Negara's technique. We confirmed that TC2P not only detected change patterns more adequately than the prior technique but also detected change patterns that the prior technique did not.
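The difference between text-based and structure-based matching can be illustrated with a minimal sketch. It is not TC2P's or Negara's implementation: the `Node` class, the node kinds (`FieldAccess`, `QualifiedName`, `Name`), the helper functions `text_key`, `tree_key`, and `group_changes`, and the rule that a pattern needs at least two matching changes are all hypothetical choices made for illustration. The sketch assumes each change is reduced to the subtree it inserts.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Hashable, List, Tuple


@dataclass(frozen=True)
class Node:
    """A toy AST node (hypothetical; not the AST used by either tool)."""
    kind: str                          # node type, e.g. "FieldAccess"
    label: str = ""                    # token text, used for leaves
    children: Tuple["Node", ...] = ()


def text_key(node: Node) -> str:
    """Flatten a subtree to its token text, discarding node kinds and nesting."""
    if not node.children:
        return node.label
    return " ".join(text_key(child) for child in node.children)


def tree_key(node: Node) -> Hashable:
    """Keep node kinds, labels, and nesting, so only identical trees compare equal."""
    return (node.kind, node.label, tuple(tree_key(c) for c in node.children))


def group_changes(
    changes: List[Tuple[str, Node]], key: Callable[[Node], Hashable]
) -> List[List[str]]:
    """Group change ids by key; a group of two or more is reported as a pattern."""
    groups = defaultdict(list)
    for change_id, subtree in changes:
        groups[key(subtree)].append(change_id)
    return [ids for ids in groups.values() if len(ids) >= 2]


# Three inserted subtrees with the same tokens; c2 has a different structure.
leaves = (Node("Name", "config"), Node("Name", "timeout"))
changes = [
    ("c1", Node("FieldAccess", children=leaves)),
    ("c2", Node("QualifiedName", children=leaves)),
    ("c3", Node("FieldAccess", children=leaves)),
]

print(group_changes(changes, text_key))  # [['c1', 'c2', 'c3']]
print(group_changes(changes, tree_key))  # [['c1', 'c3']]
```

Under these assumptions, the text key merges all three changes into one pattern even though `c2` has a different node kind, while the tree key reports a pattern only for the two structurally identical changes.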