Identification of Mitochondrial Dysfunction Genes as Diagnostic Biomarkers for Ischemic Stroke by Integrated Bioinformatics Analysis and Machine Learning
Dandan Wu, Xiaolan Huang, Jie Li, Dingmin Mo, Weiwei Lan, Zihan Song, Li Su, Jianxiong Long, Jialei Yang
Journal of Molecular Neuroscience, 76(2). Published 2026-05-02. Journal Article.
DOI: 10.1007/s12031-026-02533-w
https://link.springer.com/article/10.1007/s12031-026-02533-w
Impact factor: 2.7. JCR: Q3 (Biochemistry & Molecular Biology).
Citations: 0
Abstract
Current diagnostics for ischemic stroke (IS) lack timeliness and accessibility, highlighting the need for novel molecular diagnostic models. Three gene expression datasets (GSE16561, GSE22255, and GSE58294), encompassing both IS patients and healthy control subjects, were retrieved from a public database. Mitochondrial dysfunction genes were retrieved from the intersection of the GeneCards and MitoCarta3.0 databases. The limma and WGCNA packages were used to obtain IS-related genes. Feature genes were screened using LASSO, RF, and SVM, and diagnostic models were constructed using NeighborMethod, NeuralNet, and BayesMethod. A total of 3548 differentially expressed genes (DEGs; 1538 upregulated, 2010 downregulated) were identified in IS patients compared with controls. WGCNA yielded 10 IS-related modules containing 1643 genes. Intersecting the DEGs, module genes, and mitochondrial dysfunction genes yielded 100 mitochondrial dysfunction genes associated with IS. These genes collectively regulate biological processes such as mitochondrial ATP synthesis coupled electron transport and the respiratory electron transport chain, and participate in IS-associated signaling pathways such as reactive oxygen species and oxidative phosphorylation. Machine learning then identified four feature genes: MCL1, MRPL46, MTX3, and RNASEH1. Each of the four genes exhibited diagnostic potential in the merged dataset (all AUC > 0.7). The machine learning models achieved AUC values of 0.814 (NeighborMethod), 0.852 (NeuralNet), and 0.842 (BayesMethod). External validation in an independent cohort confirmed that all models retained diagnostic accuracy (AUC range: 0.730–0.783). This study established a multi-gene diagnostic model for IS and identified novel molecular biomarkers that may improve the timeliness and accessibility of IS diagnosis.
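Two steps in the abstract are simple enough to illustrate directly: intersecting the three gene lists to get candidate genes, and scoring a single gene or model by AUC. The following is a minimal Python sketch, not the authors' pipeline (the abstract names the R packages limma and WGCNA). The function names, the placeholder genes GENE_A, GENE_B, and GENE_C, and all numeric values are hypothetical and chosen only for illustration. The AUC is computed here as the fraction of case-control pairs in which the case scores higher, which is one standard definition.

```python
from typing import Iterable, Sequence


def candidate_genes(*gene_sets: Iterable[str]) -> set[str]:
    """Return the genes that appear in every input gene list."""
    sets = [set(genes) for genes in gene_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (case, control) pairs where the case scores higher.

    Labels are 1 for cases and 0 for controls; tied pairs count as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy inputs invented for illustration; these are NOT data from the study.
degs = {"MCL1", "MRPL46", "MTX3", "RNASEH1", "GENE_A"}
module_genes = {"MCL1", "MRPL46", "MTX3", "RNASEH1", "GENE_B"}
mito_genes = {"MCL1", "MRPL46", "MTX3", "RNASEH1", "GENE_C"}

shared = candidate_genes(degs, module_genes, mito_genes)
print(sorted(shared))

# Hypothetical expression scores for six samples (three cases, three controls).
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
print(round(auc(toy_scores, toy_labels), 3))
```

For a gene that is downregulated in patients, this pairwise AUC falls below 0.5; in that case the score is usually negated (or the AUC reported as one minus the value) so that it reflects discriminative ability regardless of direction.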
About the journal:
The Journal of Molecular Neuroscience is committed to the rapid publication of original findings that increase our understanding of the molecular structure, function, and development of the nervous system. The criteria for acceptance of manuscripts will be scientific excellence, originality, and relevance to the field of molecular neuroscience. Manuscripts with clinical relevance are especially encouraged since the journal seeks to provide a means for accelerating the progression of basic research findings toward clinical utilization. All experiments described in the Journal of Molecular Neuroscience that involve the use of animal or human subjects must have been approved by the appropriate institutional review committee and conform to accepted ethical standards.