Jianqing Lin;Cheng He;Hanjing Jiang;Yabing Huang;Yaochu Jin
{"title":"从大规模单细胞RNA测序数据中辅助多目标基因选择进行细胞分类","authors":"Jianqing Lin;Cheng He;Hanjing Jiang;Yabing Huang;Yaochu Jin","doi":"10.1109/TEVC.2025.3533490","DOIUrl":null,"url":null,"abstract":"Accurate cell classification is crucial but expensive for large-scale single-cell RNA sequencing (scRNA-seq) analysis. Gene selection (GS) emerges as a pivotal technique in identifying gene subsets of scRNA-seq for classification accuracy improvement and gene scale reduction. Nevertheless, the rising scale of scRNA-seq data presents challenges to existing GS methods regarding performance and computational time. Thus, we propose a surrogate-assisted evolutionary algorithm for multiobjective GS to address these deficiencies. An innovative two-phase initialization method is proposed to select sparse solutions to provide preliminary insights into gene contributions. Then, a binary competitive swarm optimizer is proposed for effective global search, where a local search method is embedded to eliminate irrelevant genes for efficiency consideration. Additionally, a surrogate model is adopted to forecast classification accuracy efficiently and substitutes part of the computationally expensive classification process. Experiments are conducted on eight large-scale scRNA-seq datasets with more than 20 000 genes. The effectiveness of the proposed GS method for scRNA-seq cell classification compared with eight state-of-the-art methods is validated. Gene expression analysis results of selected genes further validated the significance of the genes selected by the proposed method in the classification of scRNA-seq data.","PeriodicalId":13206,"journal":{"name":"IEEE Transactions on Evolutionary Computation","volume":"29 3","pages":"601-615"},"PeriodicalIF":11.7000,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Surrogate-Assisted Multiobjective Gene Selection for Cell Classification From Large-Scale Single-Cell RNA Sequencing Data\",\"authors\":\"Jianqing Lin;Cheng He;Hanjing Jiang;Yabing Huang;Yaochu Jin\",\"doi\":\"10.1109/TEVC.2025.3533490\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurate cell classification is crucial but expensive for large-scale single-cell RNA sequencing (scRNA-seq) analysis. Gene selection (GS) emerges as a pivotal technique in identifying gene subsets of scRNA-seq for classification accuracy improvement and gene scale reduction. Nevertheless, the rising scale of scRNA-seq data presents challenges to existing GS methods regarding performance and computational time. Thus, we propose a surrogate-assisted evolutionary algorithm for multiobjective GS to address these deficiencies. An innovative two-phase initialization method is proposed to select sparse solutions to provide preliminary insights into gene contributions. Then, a binary competitive swarm optimizer is proposed for effective global search, where a local search method is embedded to eliminate irrelevant genes for efficiency consideration. Additionally, a surrogate model is adopted to forecast classification accuracy efficiently and substitutes part of the computationally expensive classification process. Experiments are conducted on eight large-scale scRNA-seq datasets with more than 20 000 genes. The effectiveness of the proposed GS method for scRNA-seq cell classification compared with eight state-of-the-art methods is validated. Gene expression analysis results of selected genes further validated the significance of the genes selected by the proposed method in the classification of scRNA-seq data.\",\"PeriodicalId\":13206,\"journal\":{\"name\":\"IEEE Transactions on Evolutionary Computation\",\"volume\":\"29 3\",\"pages\":\"601-615\"},\"PeriodicalIF\":11.7000,\"publicationDate\":\"2025-01-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Evolutionary Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10852178/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Evolutionary Computation","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10852178/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Surrogate-Assisted Multiobjective Gene Selection for Cell Classification From Large-Scale Single-Cell RNA Sequencing Data
Accurate cell classification is crucial but expensive for large-scale single-cell RNA sequencing (scRNA-seq) analysis. Gene selection (GS) emerges as a pivotal technique in identifying gene subsets of scRNA-seq for classification accuracy improvement and gene scale reduction. Nevertheless, the rising scale of scRNA-seq data presents challenges to existing GS methods regarding performance and computational time. Thus, we propose a surrogate-assisted evolutionary algorithm for multiobjective GS to address these deficiencies. An innovative two-phase initialization method is proposed to select sparse solutions to provide preliminary insights into gene contributions. Then, a binary competitive swarm optimizer is proposed for effective global search, where a local search method is embedded to eliminate irrelevant genes for efficiency consideration. Additionally, a surrogate model is adopted to forecast classification accuracy efficiently and substitutes part of the computationally expensive classification process. Experiments are conducted on eight large-scale scRNA-seq datasets with more than 20 000 genes. The effectiveness of the proposed GS method for scRNA-seq cell classification compared with eight state-of-the-art methods is validated. Gene expression analysis results of selected genes further validated the significance of the genes selected by the proposed method in the classification of scRNA-seq data.
期刊介绍:
The IEEE Transactions on Evolutionary Computation is published by the IEEE Computational Intelligence Society on behalf of 13 societies: Circuits and Systems; Computer; Control Systems; Engineering in Medicine and Biology; Industrial Electronics; Industry Applications; Lasers and Electro-Optics; Oceanic Engineering; Power Engineering; Robotics and Automation; Signal Processing; Social Implications of Technology; and Systems, Man, and Cybernetics. The journal publishes original papers in evolutionary computation and related areas such as nature-inspired algorithms, population-based methods, optimization, and hybrid systems. It welcomes both purely theoretical papers and application papers that provide general insights into these areas of computation.