A multi-population coevolutionary approach for software defect prediction with imbalanced data

L. Bui, V. Vu, Bich Van Pham, V. Phan
DOI: 10.1109/KSE56063.2022.9953798
Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE)
Publication date: 2022-10-19
URL: https://doi.org/10.1109/KSE56063.2022.9953798
Citations: 0

Abstract

This paper proposes COESDP, a cooperative coevolutionary approach to the software defect prediction (SDP) problem. The method consists of three main phases. The first phase preprocesses the data, including data sampling and cleaning. The second phase uses a multi-population coevolutionary approach (MPCA) to find optimal instance selection solutions. Together, these first two phases address the imbalanced data challenge of the SDP problem: data sampling creates a more balanced data set, while MPCA eliminates unnecessary data samples (instances) and selects the crucial ones. The output of phase 2 is a set of different optimal solutions, each of which is a selection of instances from which a classifier (a weak learner) is built. Phase 3 uses an ensemble learning method to combine these weak learners and produce the final result. The proposed algorithm is compared with conventional machine learning algorithms, ensemble learning algorithms, computational intelligence algorithms, and another multi-population algorithm on 6 standard SDP datasets. Experimental results show that the proposed method gives better and more stable results than the other methods and can tackle the challenge of imbalance in SDP data.
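
The three-phase structure described above can be illustrated with a small, self-contained sketch. This is not the COESDP implementation: the abstract does not specify the sampling method, the MPCA operators, the fitness function, or the ensemble rule. Everything below is therefore an assumption made for illustration: random oversampling stands in for phase 1, a generic island-style multi-population genetic algorithm with migration stands in for MPCA, a 1-nearest-neighbour classifier stands in for the weak learner, and majority voting stands in for the ensemble. All function names and the toy data are hypothetical.

```python
import random


def oversample(X, y, rng):
    """Phase 1 stand-in: duplicate random minority rows until classes balance."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def predict_1nn(train_X, train_y, x):
    """Weak learner stand-in: label of the nearest selected instance."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_X[i], x))
    return train_y[min(range(len(train_X)), key=dist)]


def select(mask, X, y):
    """Return the instances whose mask bit is 1."""
    keep = [i for i, bit in enumerate(mask) if bit]
    return [X[i] for i in keep], [y[i] for i in keep]


def fitness(mask, X, y):
    """Assumed objective: balanced accuracy of 1-NN built on the selection."""
    sx, sy = select(mask, X, y)
    if not sx:
        return 0.0
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for x, label in zip(X, y):
        totals[label] += 1
        hits[label] += predict_1nn(sx, sy, x) == label
    return sum(hits[c] / totals[c] for c in (0, 1)) / 2


def evolve(X, y, rng, n_pops=3, pop_size=8, generations=10):
    """Phase 2 stand-in: several populations of instance masks with migration."""
    n = len(X)

    def score(mask):
        return fitness(mask, X, y)

    pops = [[[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
            for _ in range(n_pops)]
    for _ in range(generations):
        for p, pop in enumerate(pops):
            pop.sort(key=score, reverse=True)
            parents = pop[: pop_size // 2]          # keep the better half
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n)           # one-point crossover
                child = a[:cut] + b[cut:]
                j = rng.randrange(n)                # single bit-flip mutation
                child[j] = 1 - child[j]
                children.append(child)
            pops[p] = parents + children
        # Migration: each population's best overwrites the newest child next door.
        bests = [max(pop, key=score) for pop in pops]
        for p in range(n_pops):
            pops[(p + 1) % n_pops][-1] = list(bests[p])
    return [max(pop, key=score) for pop in pops]    # one solution per population


def ensemble_predict(masks, X, y, x):
    """Phase 3 stand-in: majority vote over one weak learner per mask."""
    votes = []
    for mask in masks:
        sx, sy = select(mask, X, y)
        if sx:
            votes.append(predict_1nn(sx, sy, x))
    return int(sum(votes) * 2 > len(votes))


# Toy imbalanced data: 12 "clean" modules (0) and 3 "defective" modules (1).
rng = random.Random(0)
X = [[(i % 4) * 0.5, (i // 4) * 0.5] for i in range(12)]
X += [[5.0, 5.0], [5.5, 4.5], [4.5, 5.5]]
y = [0] * 12 + [1] * 3

Xb, yb = oversample(X, y, rng)
masks = evolve(Xb, yb, rng)
print(ensemble_predict(masks, Xb, yb, [5.0, 5.2]))
```

Each mask returned by `evolve` corresponds to one "solution" in the sense of the abstract: a choice of training instances from which one weak learner is built. Swapping in a different sampler, fitness function, or base classifier only requires replacing the corresponding function.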