Optimizing super-feature selection for machine learning-enhanced spectroscopic analysis in biomedical research

IF 4.6, Zone 2 (Chemistry), JCR Q1 (Spectroscopy)
Jizhou Zhong , Hany M. Elsheikha , Ka Lung Andrew Chan
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, Volume 344, Article 126639. Published 2025-07-01.
DOI: 10.1016/j.saa.2025.126639
URL: https://www.sciencedirect.com/science/article/pii/S1386142525009461

Abstract

Purpose

Machine-learning-powered label-free infrared spectroscopic methods offer significant potential for diagnostic and biomedical applications. However, their applications have been limited by spectral noise, where critical features are often obscured by overlapping bands and data redundancy. Although various feature selection methods have been proposed, many suffer from limitations in consistency and interpretability. To address these challenges, we introduce a novel multi-model machine learning approach that integrates five distinct algorithms to identify a set of “super-features”—spectral features consistently deemed significant across all models.
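
The consensus idea described above can be illustrated with a minimal sketch. The abstract does not name the five algorithms or state how "significant" is decided, so everything here is an assumption: the model names are placeholders, the importance scores are invented for illustration, and "significant" is taken to mean "ranked in the top k by that model". A super-feature is then a feature selected by every model.

```python
from typing import Dict, List, Set


def top_k(scores: List[float], k: int) -> Set[int]:
    """Indices of the k highest-scoring features for one model."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def super_features(importances: Dict[str, List[float]], k: int) -> List[int]:
    """Features ranked in the top k by every model (set intersection)."""
    selected = [top_k(scores, k) for scores in importances.values()]
    if not selected:
        return []
    return sorted(set.intersection(*selected))


# Hypothetical importance scores for 6 spectral features from 5 placeholder
# models; these numbers are illustrative and do not come from the paper.
importances = {
    "model_a": [0.9, 0.1, 0.8, 0.2, 0.7, 0.0],
    "model_b": [0.8, 0.3, 0.9, 0.1, 0.2, 0.6],
    "model_c": [0.7, 0.0, 0.6, 0.5, 0.1, 0.2],
    "model_d": [0.6, 0.2, 0.9, 0.1, 0.8, 0.3],
    "model_e": [0.9, 0.8, 0.7, 0.0, 0.1, 0.2],
}

consensus = super_features(importances, k=3)
print(consensus)  # [0, 2]
```

Requiring agreement across all models is a strict rule: a feature that one model ranks just outside its top k is dropped, which is why the consensus set is smaller than any single model's selection.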

Principal results

This novel workflow outperforms traditional algorithms, achieving superior classification accuracy (>99%) in distinguishing infected from healthy cells, despite using fewer spectral features. To ensure robustness and generalizability, we developed a comprehensive validation strategy that includes independent classifier evaluations, label randomization, and unsupervised analyses. Importantly, the identified super-features accurately differentiated infection states across multiple time points and enhanced the biological interpretability of infection-associated biochemical changes.
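
Label randomization, one of the validation steps mentioned above, can be sketched as a permutation test: if a classifier scores as well on shuffled labels as on the true ones, its accuracy reflects chance or leakage rather than real signal. The abstract gives no implementation details, so the leave-one-out nearest-centroid classifier, the toy data, and the function names below are all assumptions chosen to keep the example dependency-free.

```python
import random
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def loo_centroid_accuracy(X: Matrix, y: Sequence[int]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        centroids = {}
        for label in set(y):
            rows = [x for j, (x, l) in enumerate(zip(X, y)) if l == label and j != i]
            if rows:
                centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        # Predict the label whose centroid is closest (squared Euclidean distance).
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(xi, centroids[c])),
        )
        correct += pred == yi
    return correct / len(X)


def permutation_p_value(
    score_fn: Callable[[Matrix, Sequence[int]], float],
    X: Matrix,
    y: Sequence[int],
    n_permutations: int = 200,
    seed: int = 0,
) -> Tuple[float, float]:
    """Observed score and the fraction of label shuffles that match or beat it."""
    rng = random.Random(seed)
    observed = score_fn(X, y)
    hits = 0
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)
        if score_fn(X, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy data (not from the paper): two well-separated groups, two features each.
X = [[0.1 * i, 1.0] for i in range(10)] + [[5.0 + 0.1 * i, 1.2] for i in range(10)]
y = [0] * 10 + [1] * 10

observed, p_value = permutation_p_value(loo_centroid_accuracy, X, y)
print(f"accuracy={observed:.2f}, p={p_value:.4f}")
```

A small p-value here means that few or no shuffled labelings reached the accuracy obtained with the true labels, which is the pattern a genuine class difference should produce.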

Conclusions

These findings highlight the potential of advanced multi-model feature selection techniques to enhance the diagnostic power of spectroscopic data in biomedical research, offering high accuracy and valuable biological insights into infection progression.
Source journal
CiteScore: 8.40
Self-citation rate: 11.40%
Articles published: 1364
Review time: 40 days
Journal description: Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy (SAA) is an interdisciplinary journal that spans basic to applied aspects of optical spectroscopy in chemistry, medicine, biology, and materials science. The journal publishes original scientific papers that feature high-quality spectroscopic data and analysis. From the broad range of optical spectroscopies, the emphasis is on electronic, vibrational or rotational spectra of molecules, rather than on spectroscopy based on magnetic moments. Criteria for publication in SAA are novelty, uniqueness, and outstanding quality. Routine applications of spectroscopic techniques and computational methods are not appropriate. Topics of particular interest to Spectrochimica Acta Part A include, but are not limited to: spectroscopy and dynamics of bioanalytical, biomedical, environmental, and atmospheric sciences; novel experimental techniques or instrumentation for molecular spectroscopy; novel theoretical and computational methods; novel applications in photochemistry and photobiology; and novel interpretational approaches as well as advances in data analysis based on electronic or vibrational spectroscopy.