An Improved Two-Step Supervised Learning Artificial Neural Network for Imbalanced Dataset Problems

2011 Third International Conference on Computational Intelligence, Modelling & Simulation. Pub Date: 2011-09-20. DOI: 10.1109/CIMSIM.2011.28

H. Shamsudin, Asrul Adam, M. I. Shapiai, M. Basri, Z. Ibrahim, M. Khalid


Citations: 4

Abstract



This paper proposes an improved two-step supervised learning algorithm for Artificial Neural Networks (ANN) that targets imbalanced dataset problems. Particle swarm optimization (PSO) serves as the ANN learning mechanism in both steps, and the fitness function for both steps is the Geometric Mean (G-Mean). In the first step, the best network weights are determined with the decision threshold set to 0.5. Once first-step learning is complete, those best weights are used for second-step learning, which yields the best weights together with the best decision threshold value; these can then be used to make predictions on an imbalanced dataset. Haberman's Survival dataset, available in the UCI Machine Learning Repository, is chosen as the case study, and G-Mean is the evaluation measure used to assess the classifier's performance. The proposed approach achieves a better G-Mean value than the previously proposed ANN on this imbalanced dataset problem.
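To make the role of the decision threshold concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual definition of G-Mean for binary classification, the square root of sensitivity times specificity, which the abstract itself does not spell out. It also replaces the PSO search with a plain sweep over candidate thresholds applied to fixed network outputs, so it only illustrates why moving the threshold away from 0.5 can raise G-Mean on imbalanced data. The function names `g_mean` and `best_threshold` and the toy scores are hypothetical.

```python
import math


def g_mean(y_true, y_pred):
    """G-Mean for 0/1 labels: sqrt(sensitivity * specificity) (assumed definition)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def best_threshold(y_true, scores, candidates):
    """Return (threshold, G-Mean) maximizing G-Mean; 0.5 is the baseline.

    A simple sweep stands in for PSO here (an assumption made for brevity).
    """
    best_t = 0.5
    best_g = g_mean(y_true, [int(s >= best_t) for s in scores])
    for t in candidates:
        g = g_mean(y_true, [int(s >= t) for s in scores])
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


# Toy imbalanced data (hypothetical): six majority-class and two minority-class samples.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.12, 0.18, 0.22, 0.31, 0.45, 0.35, 0.42]  # network outputs in [0, 1]

baseline = g_mean(y_true, [int(s >= 0.5) for s in scores])
threshold, tuned = best_threshold(y_true, scores, [i / 10 for i in range(1, 10)])
print(f"G-Mean at 0.5: {baseline:.3f}; best threshold {threshold}: {tuned:.3f}")
```

With these toy scores every sample falls below 0.5, so the baseline classifier never predicts the minority class and its G-Mean is zero, while a lower threshold recovers both minority samples at the cost of some false positives. In the paper's setting the second step searches weights and threshold together with PSO rather than sweeping the threshold alone.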
