Multi-label learning for identifying co-occurring class code smells

IF 3.3 | Zone 3, Computer Science | Q2, Computer Science, Theory & Methods
Mouna Hadj-Kacem, Nadia Bouassida
Journal: Computing. Publication date: 2024-05-27. Publication type: Journal Article. DOI: https://doi.org/10.1007/s00607-024-01294-x
Citations: 0

Abstract


Multi-label learning for identifying co-occurring class code smells


Code smell identification is crucial in software maintenance. The existing literature mostly focuses on identifying a single code smell at a time. In practice, however, a software artefact typically exhibits multiple code smells simultaneously: assessments of smell diffuseness suggest that 59% of smelly classes are affected by more than one smell. To address this complexity of real-world projects, we propose a multi-label learning-based approach to identify eight code smells at the class level, i.e. in the most severe software artefacts, which need to be prioritized in the refactoring process. In our experiments, we applied 12 algorithms from different multi-label learning methods to 30 open-source Java projects. We explored co-occurrences between class code smells and examined the impact of correlations on prediction results. Additionally, we assessed multi-label learning methods to compare data adaptation with algorithm adaptation. Our findings highlight the effectiveness of the Ensemble of Classifier Chains and Binary Relevance in achieving high-performance results.
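
The two quantities the abstract refers to, the share of smelly classes affected by more than one smell and the pairwise co-occurrence of smells, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the smell labels, and the helper functions `multi_smell_share` and `co_occurrence_counts` are hypothetical and are not taken from the paper's dataset or tooling.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each class maps to the set of smells detected in it.
# Class names and smell labels are illustrative only, not the paper's dataset.
smells_by_class = {
    "OrderManager": {"God Class", "Complex Class"},
    "ReportBuilder": {"God Class", "Complex Class", "Spaghetti Code"},
    "UserDto": {"Lazy Class"},
    "Parser": {"Spaghetti Code"},
    "Utils": set(),  # no smell detected
}


def multi_smell_share(labels):
    """Fraction of smelly classes (at least one smell) with more than one smell."""
    smelly = [smells for smells in labels.values() if smells]
    if not smelly:
        return 0.0
    return sum(len(smells) > 1 for smells in smelly) / len(smelly)


def co_occurrence_counts(labels):
    """Count how many classes exhibit each unordered pair of smells together."""
    counts = Counter()
    for smells in labels.values():
        # Sorting makes each pair a canonical (a, b) tuple with a < b.
        for pair in combinations(sorted(smells), 2):
            counts[pair] += 1
    return counts


share = multi_smell_share(smells_by_class)
pairs = co_occurrence_counts(smells_by_class)

print(f"multi-smell share: {share:.2f}")
for (first, second), count in pairs.most_common():
    print(f"{first} + {second}: {count}")
```

In a multi-label setting, each class would carry a full label vector like the sets above, so pairs with high counts are the kind of label correlation whose effect on prediction the abstract says was examined.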

Source journal
Computing (Engineering & Technology; Computer Science: Theory & Methods)
CiteScore: 8.20
Self-citation rate: 2.70%
Articles published: 107
Review time: 3 months
Journal description: Computing publishes original papers, short communications and surveys on all fields of computing. Contributions should be written in English and may be theoretical or applied in nature; the essential criteria are computational relevance and a systematic foundation of results.