Farah Anjum, Abdulaziz Alsharif, Maha Bakhuraysah, Alaa Shafie, Md.Imtaiyaz Hassan, Taj Mohammad
Journal of Molecular Neuroscience, vol. 75(2). Published 2025-04-30. DOI: 10.1007/s12031-025-02340-9
https://link.springer.com/article/10.1007/s12031-025-02340-9
Impact factor: 2.8. JCR: Q3 (Biochemistry & Molecular Biology).
Discovering Novel Biomarkers and Potential Therapeutic Targets of Amyotrophic Lateral Sclerosis Through Integrated Machine Learning and Gene Expression Profiling
Amyotrophic lateral sclerosis (ALS) is a progressive, multifactorial neurodegenerative disorder. Its molecular pathogenesis is poorly understood, and diagnosis and treatment in the early stages remain difficult. Discovering novel biomarkers with diagnostic and therapeutic potential in ALS has therefore become important. Bioinformatics and machine learning algorithms are useful for identifying differentially expressed genes (DEGs) and potential biomarkers, as well as for understanding the molecular mechanisms of diseases such as ALS. In the present study, six datasets obtained from the Gene Expression Omnibus (GEO) were analyzed using an integrative bioinformatics and machine learning approach. Data preprocessing comprised log transformation, RMA normalization, and batch-effect correction. Differential expression analysis identified 206 DEGs that were significantly associated with biological processes including muscle function, energy metabolism, and mitochondrial membrane activity. Functional enrichment analysis highlighted pathways related to prion disease, Parkinson’s disease, and ATP synthesis via chemiosmotic coupling. We employed a multi-step machine learning framework incorporating random forest, LASSO regression, and SVM-RFE to identify robust biomarkers. This approach identified three key genes, CHRNA1, DLG5, and PLA2G4C, which could be explored as promising biomarkers for ALS after further validation. Internal validation, including principal component analysis (PCA) and ROC-AUC analysis, demonstrated strong diagnostic potential for these hub genes, with an AUC of 0.96. This work highlights the utility of bioinformatics and machine learning in identifying key genes as diagnostic biomarkers and potential therapeutic targets in ALS.
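The abstract does not say how the outputs of random forest, LASSO, and SVM-RFE were combined, nor how the AUC was computed. As a minimal sketch, assuming the three selectors are merged by simple intersection and the AUC is the usual rank-based statistic, the two steps might look like the following. The gene labels, scores, and function names are hypothetical placeholders, not data or code from the study.

```python
from typing import Iterable, List, Sequence


def consensus_features(*selections: Iterable[str]) -> List[str]:
    """Return the features chosen by every selector (assumed rule: intersection)."""
    sets = [set(s) for s in selections]
    if not sets:
        return []
    # Sort so the result is deterministic regardless of set ordering.
    return sorted(set.intersection(*sets))


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical selector outputs (placeholder gene labels, not study results).
rf_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
lasso_genes = ["GENE_B", "GENE_C", "GENE_E"]
svm_rfe_genes = ["GENE_C", "GENE_B", "GENE_F"]
shared = consensus_features(rf_genes, lasso_genes, svm_rfe_genes)

# Toy classifier scores: 1 = case, 0 = control (illustrative only).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

print(shared)
print(round(toy_auc, 3))
```

Intersection is the strictest way to combine selectors: a gene survives only if all three methods pick it, which favors a short, stable candidate list at the cost of discarding genes that only one or two methods rank highly.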
About the journal:
The Journal of Molecular Neuroscience is committed to the rapid publication of original findings that increase our understanding of the molecular structure, function, and development of the nervous system. The criteria for acceptance of manuscripts will be scientific excellence, originality, and relevance to the field of molecular neuroscience. Manuscripts with clinical relevance are especially encouraged since the journal seeks to provide a means for accelerating the progression of basic research findings toward clinical utilization. All experiments described in the Journal of Molecular Neuroscience that involve the use of animal or human subjects must have been approved by the appropriate institutional review committee and conform to accepted ethical standards.