Olaf Leßenich, S. Apel, Christian Kästner, Georg Seibt, J. Siegmund
{"title":"结构化合并中的重命名和移位代码:展望精度和性能","authors":"Olaf Leßenich, S. Apel, Christian Kästner, Georg Seibt, J. Siegmund","doi":"10.1109/ASE.2017.8115665","DOIUrl":null,"url":null,"abstract":"Diffing and merging of source-code artifacts is an essential task when integrating changes in software versions. While state-of-the-art line-based merge tools (e.g., git merge) are fast and independent of the programming language used, they have only a low precision. Recently, it has been shown that the precision of merging can be substantially improved by using a language-aware, structured approach that works on abstract syntax trees. But, precise structured merging is NP hard, especially, when considering the notoriously difficult scenarios of renamings and shifted code. To address these scenarios without compromising scalability, we propose a syntax-aware, heuristic optimization for structured merging that employs a lookahead mechanism during tree matching. The key idea is that renamings and shifted code are not arbitrarily distributed, but their occurrence follows patterns, which we address with a syntax-specific lookahead. Our experiments with 48 real-world open-source projects (4,878 merge scenarios with over 400 million lines of code) demonstrate that we can significantly improve matching precision in 28 percent of cases while maintaining performance.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":"{\"title\":\"Renaming and shifted code in structured merging: Looking ahead for precision and performance\",\"authors\":\"Olaf Leßenich, S. Apel, Christian Kästner, Georg Seibt, J. Siegmund\",\"doi\":\"10.1109/ASE.2017.8115665\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Diffing and merging of source-code artifacts is an essential task when integrating changes in software versions. While state-of-the-art line-based merge tools (e.g., git merge) are fast and independent of the programming language used, they have only a low precision. Recently, it has been shown that the precision of merging can be substantially improved by using a language-aware, structured approach that works on abstract syntax trees. But, precise structured merging is NP hard, especially, when considering the notoriously difficult scenarios of renamings and shifted code. To address these scenarios without compromising scalability, we propose a syntax-aware, heuristic optimization for structured merging that employs a lookahead mechanism during tree matching. The key idea is that renamings and shifted code are not arbitrarily distributed, but their occurrence follows patterns, which we address with a syntax-specific lookahead. Our experiments with 48 real-world open-source projects (4,878 merge scenarios with over 400 million lines of code) demonstrate that we can significantly improve matching precision in 28 percent of cases while maintaining performance.\",\"PeriodicalId\":382876,\"journal\":{\"name\":\"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)\",\"volume\":\"11 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"20\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASE.2017.8115665\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASE.2017.8115665","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Renaming and shifted code in structured merging: Looking ahead for precision and performance
Diffing and merging of source-code artifacts is an essential task when integrating changes in software versions. While state-of-the-art line-based merge tools (e.g., git merge) are fast and independent of the programming language used, they have only a low precision. Recently, it has been shown that the precision of merging can be substantially improved by using a language-aware, structured approach that works on abstract syntax trees. But, precise structured merging is NP hard, especially, when considering the notoriously difficult scenarios of renamings and shifted code. To address these scenarios without compromising scalability, we propose a syntax-aware, heuristic optimization for structured merging that employs a lookahead mechanism during tree matching. The key idea is that renamings and shifted code are not arbitrarily distributed, but their occurrence follows patterns, which we address with a syntax-specific lookahead. Our experiments with 48 real-world open-source projects (4,878 merge scenarios with over 400 million lines of code) demonstrate that we can significantly improve matching precision in 28 percent of cases while maintaining performance.