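A Bio-Medical Snake Optimizer System Driven by Logarithmic Surviving Global Search for Optimizing Feature Selection and its application for Disorder Recognition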

IF 4.8 | Zone 2 (Engineering and Technology) | JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Ruba Abu Khurma, Esraa Alhenawi, Malik Braik, Fatma A Hashim, Amit Chhabra, Pedro A Castillo
Journal of Computational Design and Engineering | Journal Article | Published 2023-11-09 | DOI: 10.1093/jcde/qwad101 (https://doi.org/10.1093/jcde/qwad101)
Citations: 0

A Bio-Medical Snake Optimizer System Driven by Logarithmic Surviving Global Search for Optimizing Feature Selection and its application for Disorder Recognition
Abstract: Enhancing medical practice is of paramount importance, given the need to protect human life. Automating patient prediction with machine learning techniques can accelerate medical treatment. To boost classifier efficiency, several preprocessing strategies must be adopted, since preprocessing plays a crucial role in this field. Feature selection (FS) is one frequently used tool that modifies the data and improves classification outcomes by lowering the dimensionality of datasets. The features excluded are irrelevant ones, which correlate poorly with the class label and therefore do not indicate which class an instance belongs to, and redundant ones, which are strongly associated with the remaining features. The presence of such features harms the model produced during training and misleads the classifier, causing overfitting and increasing algorithmic complexity and processing time. FS makes the underlying pattern clearer and yields a more general classification model with a lower chance of overfitting, within acceptable time and algorithmic complexity. To optimize the FS process, wrapper approaches employ metaheuristic algorithms (MAs) as search algorithms. This study uses the Snake Optimizer (SO) to search for the best solution, that is, the best subset of features within a particular medical dataset that aids patient diagnosis. As a swarm-based approach, SO has several general flaws, such as entrapment in local minima, premature convergence, and an imbalance between exploration and exploitation. To overcome these limitations, SO was combined with a logarithmic operator that uses the cosine function to compute the distance between the current solution and the best solution, which improves the exploitation process. This forces the solutions to spiral downward toward the best overall solution.
Additionally, SO is extended to put into practice the evolutionary-algorithm principle of preserving the best solutions. This is accomplished by using three alternative selection schemes (tournament, proportional, and linear) to improve the exploration phase. During exploration, these schemes let solutions move relative to a selected solution rather than a random one, so the search space is explored more thoroughly. The resulting variants are TLSO, PLSO, and LLSO, which stand for Tournament Logarithmic Snake Optimizer, Proportional Logarithmic Snake Optimizer, and Linear Order Logarithmic Snake Optimizer, respectively. Experiments were conducted on 22 reference medical datasets. The findings indicate that TLSO attained the best accuracy on 86% of the datasets and the best feature reduction on 82% of them. In terms of standard deviation, TLSO also showed noteworthy reliability and stability. It is nonetheless quite efficient in terms of running time.
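The abstract names three selection schemes (tournament, proportional, and linear) but does not give their formulas. The following is a minimal sketch of the textbook versions of these operators, not the authors' implementation. The function names, the tournament size `k`, the minimization convention (lower fitness is better), and the reading of "linear" as linear rank-based selection are all assumptions made for illustration.

```python
import random


def tournament_select(fitness, k=3, rng=random):
    """Sample k distinct candidates and return the index of the fittest.

    Assumes minimization: a lower fitness value is better.
    """
    k = min(k, len(fitness))
    contenders = rng.sample(range(len(fitness)), k)
    return min(contenders, key=lambda i: fitness[i])


def proportional_select(fitness, rng=random):
    """Roulette-wheel selection adapted to minimization.

    Each candidate's weight is its distance from the worst fitness,
    so better candidates are drawn more often. The small epsilon keeps
    the weights valid when all fitness values are equal.
    """
    worst = max(fitness)
    weights = [worst - f + 1e-12 for f in fitness]
    return rng.choices(range(len(fitness)), weights=weights, k=1)[0]


def linear_rank_select(fitness, rng=random):
    """Linear ranking: the weight depends only on rank, not on raw fitness.

    The best candidate gets weight n, the next n - 1, and the worst 1.
    """
    n = len(fitness)
    order = sorted(range(n), key=lambda i: fitness[i])  # best first
    weights = [n - rank for rank in range(n)]
    picked_rank = rng.choices(range(n), weights=weights, k=1)[0]
    return order[picked_rank]


if __name__ == "__main__":
    # Hypothetical fitness values for a population of five candidate subsets.
    population_fitness = [0.31, 0.12, 0.45, 0.27, 0.19]
    for select in (tournament_select, proportional_select, linear_rank_select):
        print(select.__name__, select(population_fitness))
```

In a variant such as TLSO, the index returned by `tournament_select` would presumably replace a uniformly random pick when choosing the reference solution during exploration; how the selected solution enters the SO update equations is not stated in the abstract.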
Source journal
Journal of Computational Design and Engineering (Computer Science: Human-Computer Interaction)
CiteScore: 7.70
Self-citation rate: 20.40%
Articles published: 125
Journal description: Journal of Computational Design and Engineering is an international journal that aims to provide academia and industry with a venue for rapid publication of research papers reporting innovative computational methods and applications to achieve a major breakthrough, practical improvements, and bold new research directions within a wide range of design and engineering:
• Theory and its progress in computational advancement for design and engineering
• Development of computational frameworks to support large-scale design and engineering
• Interaction issues among humans, designed artifacts, and systems
• Knowledge-intensive technologies for intelligent and sustainable systems
• Emerging technology and convergence of technology fields presented with convincing design examples
• Educational issues for academia, practitioners, and future generations
• Proposals on new research directions as well as surveys and retrospectives on mature fields