An Improved Two-Step Supervised Learning Artificial Neural Network for Imbalanced Dataset Problems

H. Shamsudin, Asrul Adam, M. I. Shapiai, M. Basri, Z. Ibrahim, M. Khalid
Published in: 2011 Third International Conference on Computational Intelligence, Modelling & Simulation
Publication date: 2011-09-20
DOI: 10.1109/CIMSIM.2011.28 (https://doi.org/10.1109/CIMSIM.2011.28)
Citations: 4

Abstract

This paper proposes an improved two-step supervised learning algorithm for Artificial Neural Networks (ANN) applied to imbalanced dataset problems. Particle swarm optimization (PSO) serves as the ANN learning mechanism in both the first and the second step, and the fitness function for both steps is the geometric mean (G-Mean). In the first step, the best network weights are determined with the decision threshold set to 0.5. Once first-step learning is complete, those best weights are used for second-step learning, which yields the best weights together with the best value of the decision threshold; these can then be used to classify an imbalanced dataset. Haberman's Survival dataset, available in the UCI Machine Learning Repository, is chosen as a case study, and G-Mean is used as the evaluation measure of the classifier's performance. In this case study, the proposed approach handles the imbalanced dataset with a better G-Mean value than the previously proposed ANN.
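The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the network architecture, the PSO variant or its parameters, or whether the second step re-tunes the weights alongside the threshold. The sketch assumes a one-hidden-layer network, a plain global-best PSO with fixed coefficients, a second step that searches weights and threshold jointly starting from the first-step weights, and the usual definition of G-Mean as the square root of sensitivity times specificity. The helper names (`g_mean`, `forward`, `pso`, `two_step_train`) are hypothetical, and the data are synthetic rather than Haberman's Survival dataset.

```python
import numpy as np

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == 0] == 0)
    return float(np.sqrt(tpr * tnr))

def forward(w, X, n_hidden):
    """One-hidden-layer network (assumed architecture); w is a flat vector."""
    d = X.shape[1]
    k = d * n_hidden
    W1 = w[:k].reshape(d, n_hidden)
    b1 = w[k:k + n_hidden]
    W2 = w[k + n_hidden:k + 2 * n_hidden]
    b2 = w[k + 2 * n_hidden]
    z = np.clip(np.tanh(X @ W1 + b1) @ W2 + b2, -50, 50)  # avoid overflow
    return 1.0 / (1.0 + np.exp(-z))

def pso(fitness, init, rng, n_particles=20, iters=50, spread=1.0):
    """Plain global-best PSO maximizing `fitness`; particle 0 starts at `init`."""
    pos = init + spread * rng.standard_normal((n_particles, init.size))
    pos[0] = init
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    for _ in range(iters):
        gbest = pbest[np.argmax(pbest_fit)]
        r1, r2 = rng.random((2, *pos.shape))
        # Inertia 0.7 and acceleration 1.5 are illustrative choices.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
    best = np.argmax(pbest_fit)
    return pbest[best].copy(), float(pbest_fit[best])

def two_step_train(X, y, n_hidden=4, seed=0):
    rng = np.random.default_rng(seed)
    n_w = X.shape[1] * n_hidden + 2 * n_hidden + 1

    # Step 1: search the weights only, decision threshold fixed at 0.5.
    def fit1(w):
        return g_mean(y, (forward(w, X, n_hidden) >= 0.5).astype(int))
    w1, g1 = pso(fit1, np.zeros(n_w), rng)

    # Step 2 (assumed form): start from the step-1 weights and search the
    # weights and the threshold jointly; the threshold is clipped to [0, 1].
    def fit2(p):
        t = np.clip(p[-1], 0.0, 1.0)
        return g_mean(y, (forward(p[:-1], X, n_hidden) >= t).astype(int))
    p2, g2 = pso(fit2, np.append(w1, 0.5), rng, spread=0.1)
    return w1, g1, p2[:-1], float(np.clip(p2[-1], 0.0, 1.0)), g2

# Synthetic imbalanced data: 240 majority (label 0) vs 60 minority (label 1).
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, (240, 3)),
               data_rng.normal(1.0, 1.0, (60, 3))])
y = np.array([0] * 240 + [1] * 60)

w1, g1, w2, threshold, g2 = two_step_train(X, y)
print(f"step 1 G-Mean (threshold 0.5): {g1:.3f}")
print(f"step 2 G-Mean (threshold {threshold:.3f}): {g2:.3f}")
```

Because the second search is seeded with the first-step weights and a threshold of 0.5, its best training G-Mean in this sketch can never fall below the first-step value. The scores printed here are training scores on synthetic data and say nothing about the results reported in the paper.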