简单还是复杂?一起获得更准确的及时缺陷预测器

2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC) Pub Date : 2022-05-01 DOI:10.1145/3524610.3527910

Xin Zhou, Donggyun Han, David Lo

{"title":"简单还是复杂?一起获得更准确的及时缺陷预测器","authors":"Xin Zhou, Donggyun Han, David Lo","doi":"10.1145/3524610.3527910","DOIUrl":null,"url":null,"abstract":"Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted features, and 2) complex models using deep learning techniques to automatically extract features. Hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of the commits. On the other hand, deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple models and complex models seem complementary to each other to some extent. To utilize the advantages of both simple and complex models, we propose a combined model namely SimCom by fusing the prediction scores of one simple and one complex model. The experimental results show that our approach can significantly outperform the state-of-the-art by 6.0-18.1%. In addition, our experimental results confirm that the simple model and complex model are complementary to each other.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Simple or Complex? Together for a More Accurate Just-In-Time Defect Predictor\",\"authors\":\"Xin Zhou, Donggyun Han, David Lo\",\"doi\":\"10.1145/3524610.3527910\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted features, and 2) complex models using deep learning techniques to automatically extract features. Hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of the commits. On the other hand, deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple models and complex models seem complementary to each other to some extent. To utilize the advantages of both simple and complex models, we propose a combined model namely SimCom by fusing the prediction scores of one simple and one complex model. The experimental results show that our approach can significantly outperform the state-of-the-art by 6.0-18.1%. In addition, our experimental results confirm that the simple model and complex model are complementary to each other.\",\"PeriodicalId\":426634,\"journal\":{\"name\":\"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3524610.3527910\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3524610.3527910","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

即时缺陷预测(JIT)旨在自动预测提交是否存在缺陷，近年来得到了广泛的研究。一般来说，大多数研究可以分为两类:1)使用传统机器学习分类器和手工制作特征的简单模型，以及2)使用深度学习技术自动提取特征的复杂模型。简单模型使用的手工特征基于专家知识，但可能不能完全表示提交的语义含义。另一方面，复杂模型使用的基于深度学习的特征表示提交的语义含义，但可能无法反映有用的专家知识。简单模型和复杂模型似乎在某种程度上是互补的。为了利用简单模型和复杂模型的优点，我们提出了一种将一个简单模型和一个复杂模型的预测分数融合的组合模型SimCom。实验结果表明，我们的方法可以显著优于目前最先进的6.0-18.1%。此外，我们的实验结果证实了简单模型和复杂模型是互补的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Simple or Complex? Together for a More Accurate Just-In-Time Defect Predictor

Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted features, and 2) complex models using deep learning techniques to automatically extract features. Hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of the commits. On the other hand, deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple models and complex models seem complementary to each other to some extent. To utilize the advantages of both simple and complex models, we propose a combined model namely SimCom by fusing the prediction scores of one simple and one complex model. The experimental results show that our approach can significantly outperform the state-of-the-art by 6.0-18.1%. In addition, our experimental results confirm that the simple model and complex model are complementary to each other.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)

自引率

0.00%

发文量