PRFE-driven gene selection with multi-classifier ensemble for cancer classification
Authors: Smitirekha Behuria, Sujata Swain, Anjan Bandyopadhyay, Mohammad Khalid Al-Sadoon, Saurav Mallik
Journal: Egyptian Informatics Journal, volume 30, Article 100637
Publication date: 2025-03-17
DOI: 10.1016/j.eij.2025.100637
URL: https://www.sciencedirect.com/science/article/pii/S1110866525000301
Citations: 0
Abstract
Cancer remains a paramount concern because of its pervasive impact on individuals and societies, the persistent challenges in treatment and prevention, and the ongoing need for global collaboration and innovation to improve outcomes and reduce its burden. Marked by uncontrolled cell growth, cancer is a leading global cause of mortality, which calls for advanced methods of accurate diagnosis. This study introduces an innovative unsupervised feature selection technique, the Principal Recursive Feature Eliminator (PRFE), for gene selection and cancer classification. Seven classifiers (Support Vector Machine, Random Forest, CatBoost, Light Gradient Boosting Machine (LightGBM), Artificial Neural Network, Convolutional Neural Network, and Long Short-Term Memory) are then used to increase the model’s robustness. The proposed approach is evaluated on nine benchmark gene expression datasets with a combination of two different algorithms. A series of experiments is conducted to assess the proposed method, focusing on the selected features and on identifying the optimal classifiers. Performance is measured by F1-score, accuracy, recall, and precision. The results show an accuracy of 99.98%, highlighting the method's potential to improve cancer classification techniques. Across many datasets, the CatBoost and CNN models outperform the other models. This research contributes to ongoing efforts to improve diagnostic precision and treatment outcomes in cancer research.
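The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say how they were computed (for example, which averaging scheme was used on multi-class datasets), so the code below assumes a binary task with labels 0 and 1. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the binary confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight samples (1 = tumour, 0 = normal); not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up labels there are 3 true positives, 2 true negatives, 2 false positives and 1 false negative, so accuracy is 5/8, precision is 3/5, recall is 3/4 and F1 is 2/3. Reporting all four matters for gene expression data, where classes are often imbalanced and accuracy alone can hide poor recall on the minority class.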
About the journal:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Submissions of innovative, previously unpublished work in subjects covered by the Journal are encouraged, whether from academic, research or commercial sources.