Sharif Naser Makhadmeh, Yousef Sanjalawe, Mohammed Azmi Al-Betar, Ahmad Nasayreh, Mohammad Aladaileh
A crossover-enhanced Marine Predators Algorithm for gene selection in microarray-based cancer classification
Artificial intelligence in the life sciences, volume 8, Article 100140. Published 2025-08-23.
DOI: 10.1016/j.ailsci.2025.100140
URL: https://www.sciencedirect.com/science/article/pii/S2667318525000169
Abstract
DNA microarrays use a chip embedded with numerous DNA sequences to estimate the expression of many genes simultaneously. The resulting data, laid out as a table, support pattern recognition algorithms that distinguish samples taken from healthy individuals from those taken from cancer patients. However, identifying useful biomarkers in these data is challenging because of their very high dimensionality and the presence of noisy, irrelevant genes. To address these challenges, this paper introduces a gene selection method that combines the minimum redundancy maximum relevance (mRMR) filter with a novel hybrid optimization algorithm. The algorithm integrates an improved Marine Predators Algorithm (MPA) with a crossover operator to form the MPAC method. MPAC aims to identify a concise set of biomarker genes that substantially improves cancer classification performance, using the k-nearest neighbor algorithm as the classifier. Its novelty lies in enhancing the performance of the MPA’s search agents: it seeks the most effective gene subsets for cancer biomarkers and is designed to optimize both the depth (exploitation) and the breadth (exploration) of the search. The hybrid approach is evaluated on nine well-known microarray datasets and compared against other base and advanced optimization algorithms. The comparisons show that MPAC performs best on most of the datasets and remains highly competitive on the others.
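The abstract does not state which crossover variant, fitness function, or value of k the method uses, so the following Python sketch is only one plausible reading of the wrapper step, not the authors' implementation. It assumes a candidate gene subset is encoded as a binary mask, recombines two masks with uniform crossover, and scores a mask by leave-one-out k-nearest neighbor accuracy combined with a penalty on subset size. The function names, the weight `alpha`, the default `k=3`, and the toy data are all hypothetical; the mRMR filter and the MPA position updates are omitted.

```python
import random
from collections import Counter


def uniform_crossover(parent_a, parent_b, rng):
    """Build a child mask by taking each bit from either parent with equal chance."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def knn_loo_accuracy(samples, labels, mask, k=3):
    """Leave-one-out k-NN accuracy using only the genes where mask == 1."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i, query in enumerate(samples):
        # Squared Euclidean distance to every other sample, on selected genes only.
        distances = sorted(
            (sum((query[g] - other[g]) ** 2 for g in selected), labels[j])
            for j, other in enumerate(samples)
            if j != i
        )
        votes = Counter(label for _, label in distances[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(samples)


def fitness(samples, labels, mask, alpha=0.9):
    """Weighted sum rewarding accuracy and small subsets (alpha is an assumed weight)."""
    accuracy = knn_loo_accuracy(samples, labels, mask)
    size_ratio = sum(mask) / len(mask)
    return alpha * accuracy + (1 - alpha) * (1 - size_ratio)


# Toy expression table (hypothetical): gene 0 separates the classes, genes 1-3 are noise.
samples = [
    [0.1, 5.0, 2.0, 7.0],
    [0.2, 1.0, 9.0, 3.0],
    [0.0, 8.0, 4.0, 1.0],
    [0.9, 4.5, 2.5, 6.0],
    [1.0, 1.5, 8.0, 2.0],
    [0.8, 7.0, 5.0, 0.5],
]
labels = [0, 0, 0, 1, 1, 1]

rng = random.Random(0)
child = uniform_crossover([1, 0, 1, 0], [1, 1, 0, 0], rng)
print(child, fitness(samples, labels, child))
```

In a full pipeline, a filter such as mRMR would first shrink the gene pool, and the optimizer would apply a step like `uniform_crossover` to its search agents between position updates, keeping whichever masks score higher under `fitness`.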
Journal: Artificial intelligence in the life sciences
Subject areas: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)