基于度量的代码气味检测的启发式方法和机器学习方法的比较

Fabiano Pecorelli, Fabio Palomba, D. D. Nucci, A. D. Lucia
{"title":"基于度量的代码气味检测的启发式方法和机器学习方法的比较","authors":"Fabiano Pecorelli, Fabio Palomba, D. D. Nucci, A. D. Lucia","doi":"10.1109/ICPC.2019.00023","DOIUrl":null,"url":null,"abstract":"Code smells represent poor implementation choices performed by developers when enhancing source code. Their negative impact on source code maintainability and comprehensibility has been widely shown in the past and several techniques to automatically detect them have been devised. Most of these techniques are based on heuristics, namely they compute a set of code metrics and combine them by creating detection rules; while they have a reasonable accuracy, a recent trend is represented by the use of machine learning where code metrics are used as predictors of the smelliness of code artefacts. Despite the recent advances in the field, there is still a noticeable lack of knowledge of whether machine learning can actually be more accurate than traditional heuristic-based approaches. To fill this gap, in this paper we propose a large-scale study to empirically compare the performance of heuristic-based and machine-learning-based techniques for metric-based code smell detection. We consider five code smell types and compare machine learning models with DECOR, a state-of-the-art heuristic-based approach. Key findings emphasize the need of further research aimed at improving the effectiveness of both machine learning and heuristic approaches for code smell detection: while DECOR generally achieves better performance than a machine learning baseline, its precision is still too low to make it usable in practice.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"99 1","pages":"93-104"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"65","resultStr":"{\"title\":\"Comparing Heuristic and Machine Learning Approaches for Metric-Based Code Smell Detection\",\"authors\":\"Fabiano Pecorelli, Fabio Palomba, D. D. Nucci, A. D. Lucia\",\"doi\":\"10.1109/ICPC.2019.00023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code smells represent poor implementation choices performed by developers when enhancing source code. Their negative impact on source code maintainability and comprehensibility has been widely shown in the past and several techniques to automatically detect them have been devised. Most of these techniques are based on heuristics, namely they compute a set of code metrics and combine them by creating detection rules; while they have a reasonable accuracy, a recent trend is represented by the use of machine learning where code metrics are used as predictors of the smelliness of code artefacts. Despite the recent advances in the field, there is still a noticeable lack of knowledge of whether machine learning can actually be more accurate than traditional heuristic-based approaches. To fill this gap, in this paper we propose a large-scale study to empirically compare the performance of heuristic-based and machine-learning-based techniques for metric-based code smell detection. We consider five code smell types and compare machine learning models with DECOR, a state-of-the-art heuristic-based approach. Key findings emphasize the need of further research aimed at improving the effectiveness of both machine learning and heuristic approaches for code smell detection: while DECOR generally achieves better performance than a machine learning baseline, its precision is still too low to make it usable in practice.\",\"PeriodicalId\":6853,\"journal\":{\"name\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"volume\":\"99 1\",\"pages\":\"93-104\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"65\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPC.2019.00023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPC.2019.00023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 65

摘要

代码气味表示开发人员在增强源代码时执行的不良实现选择。它们对源代码的可维护性和可理解性的负面影响在过去已经被广泛地展示出来,并且已经设计了一些自动检测它们的技术。这些技术大多基于启发式,即它们计算一组代码度量,并通过创建检测规则将它们组合起来;虽然它们具有合理的准确性,但最近的趋势是机器学习的使用,其中代码度量被用作代码工件气味的预测因子。尽管该领域最近取得了进展,但人们仍然明显缺乏关于机器学习是否真的比传统的启发式方法更准确的知识。为了填补这一空白,在本文中,我们提出了一项大规模的研究,以经验比较基于启发式和基于机器学习的基于度量的代码气味检测技术的性能。我们考虑了五种代码气味类型,并将机器学习模型与DECOR(一种最先进的基于启发式的方法)进行比较。关键发现强调了进一步研究的必要性,旨在提高机器学习和启发式方法在代码气味检测方面的有效性:虽然DECOR通常比机器学习基线实现更好的性能,但其精度仍然太低,无法在实践中使用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Comparing Heuristic and Machine Learning Approaches for Metric-Based Code Smell Detection
Code smells represent poor implementation choices performed by developers when enhancing source code. Their negative impact on source code maintainability and comprehensibility has been widely shown in the past and several techniques to automatically detect them have been devised. Most of these techniques are based on heuristics, namely they compute a set of code metrics and combine them by creating detection rules; while they have a reasonable accuracy, a recent trend is represented by the use of machine learning where code metrics are used as predictors of the smelliness of code artefacts. Despite the recent advances in the field, there is still a noticeable lack of knowledge of whether machine learning can actually be more accurate than traditional heuristic-based approaches. To fill this gap, in this paper we propose a large-scale study to empirically compare the performance of heuristic-based and machine-learning-based techniques for metric-based code smell detection. We consider five code smell types and compare machine learning models with DECOR, a state-of-the-art heuristic-based approach. Key findings emphasize the need of further research aimed at improving the effectiveness of both machine learning and heuristic approaches for code smell detection: while DECOR generally achieves better performance than a machine learning baseline, its precision is still too low to make it usable in practice.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信