Surrogate-Assisted Multiobjective Gene Selection for Cell Classification From Large-Scale Single-Cell RNA Sequencing Data
Authors: Jianqing Lin; Cheng He; Hanjing Jiang; Yabing Huang; Yaochu Jin
Journal: IEEE Transactions on Evolutionary Computation, vol. 29, no. 3, pp. 601-615
Published: 2025-01-24
DOI: 10.1109/TEVC.2025.3533490
URL: https://ieeexplore.ieee.org/document/10852178/
Impact factor: 11.7 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Accurate cell classification is crucial but expensive in large-scale single-cell RNA sequencing (scRNA-seq) analysis. Gene selection (GS) is a pivotal technique for identifying gene subsets of scRNA-seq data that improve classification accuracy while reducing the number of genes. However, the growing scale of scRNA-seq data challenges existing GS methods in both performance and computational time. We therefore propose a surrogate-assisted evolutionary algorithm for multiobjective GS to address these deficiencies. A two-phase initialization method is proposed to select sparse solutions that provide preliminary insight into gene contributions. A binary competitive swarm optimizer then performs the global search, with an embedded local search method that eliminates irrelevant genes to improve efficiency. In addition, a surrogate model is adopted to predict classification accuracy efficiently and to substitute for part of the computationally expensive classification process. Experiments on eight large-scale scRNA-seq datasets with more than 20,000 genes validate the effectiveness of the proposed GS method for scRNA-seq cell classification against eight state-of-the-art methods. Gene expression analysis of the selected genes further validates their significance for the classification of scRNA-seq data.
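To make the multiobjective framing concrete, here is a minimal sketch of how candidate gene subsets could be compared on two objectives, classification error and fraction of genes kept. It is not the authors' algorithm: the abstract does not give the exact objective definitions, so the objective pair, the function names (`objectives`, `dominates`, `pareto_front`), and the toy error values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed objective pair, both minimized: (classification error, fraction of genes kept).
Objectives = Tuple[float, float]


def objectives(mask: Sequence[int], error: float) -> Objectives:
    """Pair a (made-up) classification error with the fraction of genes selected."""
    return (error, sum(mask) / len(mask))


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Toy candidates: binary masks over 8 genes (1 = gene selected), with invented errors.
candidates = [
    ([1, 1, 1, 1, 1, 1, 1, 1], 0.10),  # all genes, lowest error
    ([1, 0, 1, 0, 0, 0, 0, 0], 0.12),  # two genes, slightly higher error
    ([1, 0, 0, 0, 0, 0, 0, 0], 0.30),  # one gene, high error
    ([1, 1, 1, 1, 0, 0, 0, 0], 0.20),  # four genes, worse than the two-gene subset
]
points = [objectives(mask, err) for mask, err in candidates]
front = pareto_front(points)
print(front)
```

In this toy set the four-gene subset is dominated by the two-gene subset, which has both lower error and fewer genes, so it drops out of the front. In a real setting the error value would come from training a classifier, or from a surrogate model standing in for it, which is the expensive step the abstract refers to.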
About the journal:
The IEEE Transactions on Evolutionary Computation is published by the IEEE Computational Intelligence Society on behalf of 13 societies: Circuits and Systems; Computer; Control Systems; Engineering in Medicine and Biology; Industrial Electronics; Industry Applications; Lasers and Electro-Optics; Oceanic Engineering; Power Engineering; Robotics and Automation; Signal Processing; Social Implications of Technology; and Systems, Man, and Cybernetics. The journal publishes original papers in evolutionary computation and related areas such as nature-inspired algorithms, population-based methods, optimization, and hybrid systems. It welcomes both purely theoretical papers and application papers that provide general insights into these areas of computation.