A hybrid wrapper spider monkey optimization-simulated annealing model for optimal feature selection

International Journal of Reconfigurable and Embedded Systems (IJRES), Pub Date: 2023-11-01, DOI: 10.11591/ijres.v12.i3.pp360-375

Bibhuprasad Sahu, Amrutanshu Panigrahi, Bibhu Dash, Pawan Kumar Sharma, Abhilash Pati


Citations: 0


A hybrid wrapper spider monkey optimization-simulated annealing model for optimal feature selection

This research proposes a hybrid wrapper model to identify an informative gene subset from gene expression data. To balance exploration and exploitation, the model combines a popular meta-heuristic algorithm, the spider monkey optimizer (SMO), with simulated annealing (SA). ReliefF is used as a filter to obtain the relevant gene subset from the dataset by removing noise and outliers before the data are fed to the SMO wrapper. To enhance solution quality, simulated annealing is deployed as a local search alongside SMO in the second phase, guiding the search toward the optimal feature subset. To evaluate the proposed model, a support vector machine (SVM) is used as the fitness function to recognize the most informative biomarker genes in the cancer datasets as well as University of California, Irvine (UCI) datasets. To further evaluate the model, four different classifiers (SVM, naïve Bayes (NB), decision tree (DT), and k-nearest neighbors (KNN)) are used. The experimental results and analysis indicate that ReliefF-SO-SA-SVM performs relatively better than its state-of-the-art counterparts. On the cancer datasets, the model reaches a maximum accuracy of 99.45%.
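The second-phase idea in the abstract, simulated annealing acting as a local search over a candidate feature subset, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `anneal`, `flip_one`, `toy_fitness` and `INFORMATIVE`, the geometric cooling schedule, and every parameter value are assumptions made for illustration. The ReliefF filter and the SMO stage are not reproduced, and a toy scoring function stands in for the SVM accuracy that the paper uses as fitness.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Mask = List[int]  # 1 = feature kept, 0 = feature dropped

# Hypothetical ground truth used only by the toy fitness below.
INFORMATIVE = (0, 3, 5)
N_FEATURES = 10


def toy_fitness(mask: Sequence[int]) -> float:
    """Stand-in for classifier accuracy (assumption, not the paper's SVM).

    Rewards selecting the informative features and charges a small
    penalty for every selected feature, so smaller subsets are preferred.
    """
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.02 * sum(mask)


def flip_one(mask: Sequence[int], rng: random.Random) -> Mask:
    """Return a neighbour of `mask` that differs in exactly one position."""
    neighbour = list(mask)
    i = rng.randrange(len(neighbour))
    neighbour[i] = 1 - neighbour[i]
    return neighbour


def anneal(
    start: Sequence[int],
    fitness: Callable[[Sequence[int]], float],
    rng: random.Random,
    t_start: float = 1.0,
    t_end: float = 1e-3,
    cooling: float = 0.95,
    steps_per_temp: int = 20,
) -> Tuple[Mask, float]:
    """Refine a feature mask by simulated annealing (maximisation)."""
    current, current_fit = list(start), fitness(start)
    best, best_fit = list(current), current_fit
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            cand = flip_one(current, rng)
            cand_fit = fitness(cand)
            delta = cand_fit - current_fit
            # Always accept improvements; accept a worse neighbour with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_fit = cand, cand_fit
                if current_fit > best_fit:
                    best, best_fit = list(current), current_fit
        t *= cooling  # geometric cooling schedule (assumed)
    return best, best_fit


if __name__ == "__main__":
    rng = random.Random(0)
    # In the hybrid model the start mask would come from the SMO stage;
    # here it is simply "all features selected".
    mask, score = anneal([1] * N_FEATURES, toy_fitness, rng)
    print(mask, round(score, 4))
```

In a real wrapper, `toy_fitness` would be replaced by a cross-validated classifier score computed on the columns selected by the mask, and the start mask would be the best solution returned by the population-based search.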


Source journal

International Journal of Reconfigurable and Embedded Systems (IJRES)

CiteScore

1.50

Self-citation rate

0.00%

Number of published articles