Efficient feature envy detection and refactoring based on graph neural network

IF 3.1; CAS Zone 2 (Computer Science); JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Automated Software Engineering. Pub Date: 2024-12-05. DOI: 10.1007/s10515-024-00476-3

Dongjin Yu, Yihang Xu, Lehui Weng, Jie Chen, Xin Chen, Quanxin Yang


Citations: 0

Abstract



As a frequently occurring code smell, feature envy reduces class cohesion, increases coupling between classes, and thus hampers software maintainability. Although feature envy detection has progressed, two challenges persist. First, existing approaches often underutilize method call relationships, resulting in suboptimal detection efficiency. Second, they place little emphasis on feature envy refactoring, which is nevertheless the ultimate goal of feature envy detection. To address these challenges, we propose two approaches: SCG (SMOTE Call Graph) and SFFL (Symmetric Feature Fusion Learning). SCG casts feature envy detection as a binary classification task on a method call graph. It predicts edge weights, termed calling strength, to capture the strength of method invocations. It then converts the method-method call graph into a method-class call graph and recommends moving each smelly method to the external class with the highest calling strength. SFFL is a holistic approach that targets feature envy refactoring directly: it uses four heterogeneous graphs to represent method-class relationships and, through Symmetric Feature Fusion Learning, obtains representations for methods and classes. Link prediction then generates the refactored method-class ownership graph, which is taken as the refactoring result. Moreover, because existing metrics do not accurately evaluate refactoring performance, we introduce three new metrics: \(\textit{precision}_2\), \(\textit{recall}_2\) and \(\textit{F}_1\text {-score}_2\). Extensive experiments on five open-source projects demonstrate the superiority of SCG and SFFL. The code and dataset used in our study are available at https://github.com/HduDBSI/SCG-SFFL.
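The recommendation step that the abstract attributes to SCG (collapsing the method-method call graph into a method-class call graph and picking the external class with the highest calling strength) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `recommend_target_class`, the dictionary-based graph encoding, the example method names, and the choice to aggregate calling strength by summing edge weights per class are all assumptions made for illustration. In the paper the edge weights are predicted by a graph neural network; here they are hard-coded.

```python
from collections import defaultdict


def recommend_target_class(method, owner, call_strength):
    """Return the external class with the highest aggregated calling
    strength for `method`, or None if it calls no external class.

    owner:         dict mapping method name -> name of its owning class
    call_strength: dict mapping (caller, callee) -> edge weight
                   (hypothetical stand-in for model-predicted weights)
    """
    # Collapse method->method edges into method->class edges.
    # Summing the weights per class is an assumption of this sketch.
    per_class = defaultdict(float)
    for (caller, callee), weight in call_strength.items():
        if caller == method:
            per_class[owner[callee]] += weight

    # Only classes other than the current owner are move targets.
    external = {c: w for c, w in per_class.items() if c != owner[method]}
    if not external:
        return None
    return max(external, key=external.get)


# Hypothetical toy project: Order.total mostly calls into Customer.
owner = {
    "Order.total": "Order",
    "Order.tax": "Order",
    "Customer.discount": "Customer",
    "Customer.tier": "Customer",
}
call_strength = {
    ("Order.total", "Order.tax"): 0.2,
    ("Order.total", "Customer.discount"): 0.7,
    ("Order.total", "Customer.tier"): 0.6,
}

print(recommend_target_class("Order.total", owner, call_strength))
```

In this toy graph the aggregated strength toward `Customer` (0.7 + 0.6) exceeds the strength toward the owning class `Order` (0.2), so `Customer` is the suggested move target. A method with no outgoing calls to external classes yields `None`.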


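Source journal

Automated Software Engineering (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 4.80

Self-citation rate: 11.80%

Publication volume:

Review time: >12 weeks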

About the journal: This journal publishes research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences, and workshops.