A crossover-enhanced Marine Predators Algorithm for gene selection in microarray-based cancer classification

IF 5.4
Sharif Naser Makhadmeh, Yousef Sanjalawe, Mohammed Azmi Al-Betar, Ahmad Nasayreh, Mohammad Aladaileh
DOI: 10.1016/j.ailsci.2025.100140
Journal: Artificial intelligence in the life sciences, volume 8, Article 100140
Publication date: 2025-08-23
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S2667318525000169
Citations: 0

Abstract

The DNA microarray technique uses a chip embedded with numerous DNA sequences to estimate the expression of many genes simultaneously. The resulting data, laid out in table format, are the input to pattern recognition algorithms that distinguish samples from healthy individuals from those with cancer. However, identifying useful biomarkers in these data is challenging because of their high dimensionality and the presence of noisy, irrelevant genes. To address these challenges, this paper introduces a gene selection method that uses a filter called Minimum redundancy maximum relevancy, combined with a novel hybrid optimization algorithm. This algorithm integrates the Improved Marine Predator Optimizer (MPA) with the Crossover operator to form the MPAC method. MPAC aims to identify a concise set of biomarker genes that substantially improves cancer classification performance, and it employs the k-nearest neighbor algorithm as the classifier. The innovation in MPAC lies in enhancing the performance of the MPA's search agents: it seeks the most effective gene subsets for cancer biomarkers and is designed to balance the depth (exploitation) and breadth (exploration) of the search. The hybrid approach is tested on nine well-known microarray datasets and compared against other base and advanced optimization algorithms. The comparisons show that the proposed MPAC approach performs best on most of the datasets and remains highly competitive on the others.
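The abstract does not give implementation details, so the following is only a minimal sketch of the two generic building blocks it names: a crossover operator acting on candidate gene subsets and a k-nearest-neighbor classifier used to score them. Everything here is an assumption made for illustration and not the authors' MPAC: the binary-mask encoding, the uniform crossover variant, the leave-one-out evaluation, the weight `alpha`, and all function names are hypothetical. The Minimum redundancy maximum relevancy filter and the MPA update rules are not shown.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Mix two binary gene masks: each position is copied from one parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def loo_knn_accuracy(samples, labels, mask, k=1):
    """Leave-one-out k-NN accuracy using only the genes where mask is 1."""
    genes = [i for i, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, x in enumerate(samples):
        # Squared Euclidean distance to every other sample on the selected genes.
        dists = sorted(
            (sum((x[g] - y[g]) ** 2 for g in genes), labels[j])
            for j, y in enumerate(samples)
            if j != i
        )
        votes = [label for _, label in dists[:k]]
        # Majority vote; ties go to the smallest label so the result is deterministic.
        predicted = max(sorted(set(votes)), key=votes.count)
        correct += predicted == labels[i]
    return correct / len(samples)


def fitness(samples, labels, mask, alpha=0.9):
    """Lower is better: weighted sum of error rate and fraction of genes kept.

    The weight alpha is an assumed value, not one taken from the paper.
    """
    error = 1.0 - loo_knn_accuracy(samples, labels, mask)
    return alpha * error + (1.0 - alpha) * sum(mask) / len(mask)


# Toy data (made up): gene 0 separates the classes, gene 1 is noise.
toy_samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9]]
toy_labels = [0, 0, 1, 1]

rng = random.Random(0)
child = uniform_crossover([1, 0], [0, 1], rng)
informative_score = fitness(toy_samples, toy_labels, [1, 0])
noisy_score = fitness(toy_samples, toy_labels, [0, 1])
```

On this toy data the mask that keeps only the informative gene gets a lower (better) fitness than the mask that keeps only the noisy gene. A search algorithm would use such a score to rank candidate subsets, and a crossover step to recombine them.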
Source journal: Artificial intelligence in the life sciences
Subject areas: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 0
Review time: 15 days