Simple or Complex? Together for a More Accurate Just-In-Time Defect Predictor

Xin Zhou, Donggyun Han, David Lo
Venue: 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)
Published: 2022-05-01
DOI: 10.1145/3524610.3527910 (https://doi.org/10.1145/3524610.3527910)
Citations: 2

Abstract

Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective, and has been widely studied in recent years. Most studies fall into two categories: 1) simple models that use traditional machine learning classifiers with hand-crafted features, and 2) complex models that use deep learning techniques to extract features automatically. The hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of commits. Conversely, the deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple and complex models therefore seem complementary to some extent. To exploit the advantages of both, we propose a combined model named SimCom, which fuses the prediction scores of one simple model and one complex model. The experimental results show that our approach can significantly outperform the state-of-the-art by 6.0-18.1%. In addition, our experimental results confirm that the simple model and the complex model are complementary to each other.
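
The abstract states only that the prediction scores of one simple and one complex model are fused; it does not give the fusion rule. As a minimal sketch, assuming a weighted average of per-commit defect probabilities (the function names, the `weight` parameter, and the 0.5 decision threshold are all illustrative assumptions, not details taken from the paper), late fusion of two score lists could look like this:

```python
from typing import List, Sequence


def fuse_scores(
    simple_scores: Sequence[float],
    complex_scores: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Weighted average of two models' defect probabilities, one per commit.

    `weight` is the share given to the simple model; 0.5 is plain averaging.
    This fusion rule is an assumption made for illustration only.
    """
    if len(simple_scores) != len(complex_scores):
        raise ValueError("both models must score the same commits")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        weight * s + (1.0 - weight) * c
        for s, c in zip(simple_scores, complex_scores)
    ]


def predict_defective(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a commit as defective when its fused score reaches the threshold."""
    return [score >= threshold for score in scores]


if __name__ == "__main__":
    # Hypothetical scores for three commits from each model.
    simple = [0.9, 0.2, 0.6]
    complex_ = [0.5, 0.4, 0.1]
    fused = fuse_scores(simple, complex_)
    print(fused)
    print(predict_defective(fused))
```

Averaging lets a confident score from one model offset an uncertain score from the other, which is one simple way the complementarity described in the abstract could be exploited; other fusion rules (for example a learned combiner) would fit the same interface.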