The Raman spectroscopy combined with selective state-space algorithm for constructing a rapid disease diagnosis model
Andong Chen, Jun Yu, Chenjie Chang, Xiaoyi Lv, Xuguang Zhou, Yuxuan Guo, Enguang Zuo, Min Li, Yujia Ren, Shengquan Liu, Chen Chen, Xiantao Ai, Cheng Chen
Chemometrics and Intelligent Laboratory Systems, volume 261, Article 105375. Published 2025-03-14. DOI: 10.1016/j.chemolab.2025.105375
Impact factor: 3.7 (JCR Q2, Automation & Control Systems). Source: https://www.sciencedirect.com/science/article/pii/S0169743925000607
Citations: 0
Abstract
Constructing rapid disease diagnosis models remains a focus of research in medical artificial intelligence. Raman spectroscopy is widely used in medical diagnostics because it is non-invasive, rapid, and highly sensitive. However, multiple functional groups and compounds can resonate at the same positions and produce identical characteristic peaks in a spectrum, which limits the accuracy of the technique in disease diagnosis. Existing studies often capture either local or global information from Raman spectra, but not both, so models may overlook interactions between characteristic peaks or the fine details within individual peaks of a single spectrum. To address these issues, this paper proposes MRSMamba, a medical Raman spectroscopy model based on the selective state-space algorithm. The spectral data are first encoded into labeled sequences by a Patch module, and these sequences are then fed into the Mamba block of the selective state-space algorithm. The model uses the properties of selective state-space algorithms to capture detailed local information within each labeled segment while preserving global spectral characteristics, thereby constructing a rapid disease diagnosis model. This is the first application of the selective state-space algorithm to medical Raman spectroscopy, with modifications tailored to Raman data. For the encoding phase, the paper also introduces a new sequence labeling module designed specifically for the Mamba framework. Experiments with MRSMamba were conducted on multiple disease datasets, including benign and malignant thyroid tumor datasets, cancer datasets, and autoimmune disease datasets.
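The patch-then-scan idea described above can be illustrated with a toy example. This is only a sketch under stated assumptions, not the MRSMamba implementation: the function names `patchify` and `selective_scan`, the patch length, the use of a patch mean in place of a learned embedding, and all weights are hypothetical. It shows a spectrum being cut into segments and a recurrence whose step size and input/output projections depend on the current input, which is the "selective" property of this family of models.

```python
import math

def patchify(spectrum, patch_len):
    """Split a 1-D spectrum into non-overlapping patches, dropping any remainder."""
    n = len(spectrum) // patch_len
    return [spectrum[i * patch_len:(i + 1) * patch_len] for i in range(n)]

def softplus(x):
    return math.log1p(math.exp(x))

def selective_scan(tokens, a_diag=(-1.0, -2.0, -3.0, -4.0)):
    """Toy selective state-space recurrence over a sequence of scalar tokens.

    h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * u_t,   y_t = C_t . h_t
    delta_t, B_t and C_t are functions of the input u_t (the selective part).
    All projection weights are fixed, made-up constants (hypothetical).
    """
    n = len(a_diag)
    w_b = [0.5 + 0.1 * k for k in range(n)]
    w_c = [1.0 - 0.1 * k for k in range(n)]
    h = [0.0] * n
    outputs = []
    for u in tokens:
        delta = softplus(u)              # input-dependent step size
        b = [w * u for w in w_b]         # input-dependent input projection
        c = [w * u for w in w_c]         # input-dependent output projection
        h = [math.exp(delta * a) * hk + delta * bk * u
             for a, hk, bk in zip(a_diag, h, b)]
        outputs.append(sum(ck * hk for ck, hk in zip(c, h)))
    return outputs

# Synthetic two-peak "spectrum" (made-up data, 64 points)
spectrum = [math.exp(-((i - 20) / 4.0) ** 2) + 0.5 * math.exp(-((i - 45) / 3.0) ** 2)
            for i in range(64)]
patches = patchify(spectrum, patch_len=8)
tokens = [sum(p) / len(p) for p in patches]  # stand-in for a learned patch embedding
features = selective_scan(tokens)
print(len(patches), len(features))
```

Each patch keeps the local shape of a peak, while the recurrent state carries information across patches, which is one way to read the abstract's claim of combining local detail with global spectral context.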
On a binary classification task involving 99 cases of benign and malignant thyroid tumors, MRSMamba achieved an accuracy of 0.9286, a recall of 0.9286, a specificity of 0.9285, and an F1-score of 0.9286, with accuracy 3.57% higher than that of the MLP model. On a four-class cancer classification task, it achieved an accuracy of 0.7813, a recall of 0.7042, a specificity of 0.9165, and an F1-score of 0.7381, outperforming the standalone encoding module PACE by 6.25% in accuracy. On an autoimmune disease classification task, it achieved an accuracy of 0.9813 and an F1-score of 0.9793. These results indicate that MRSMamba performs well for rapid disease diagnosis using Raman spectroscopy and has significant potential for practical application.
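For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, recall, specificity, and F1-score can be computed from a confusion matrix. The abstract does not state how per-class values were averaged in the multi-class tasks, so macro averaging is an assumption here, and the confusion matrix is made-up illustrative data, not the paper's results. The function name `macro_metrics` is hypothetical.

```python
def macro_metrics(cm):
    """Return accuracy and macro-averaged recall, specificity and F1.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    Macro averaging is an assumption; the source does not specify it.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    recalls, specs, f1s = [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(k)) - tp     # predicted c, true class otherwise
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * recall / (prec + recall) if prec + recall else 0.0
        recalls.append(recall)
        specs.append(spec)
        f1s.append(f1)
    return {
        "accuracy": sum(cm[c][c] for c in range(k)) / total,
        "recall": sum(recalls) / k,
        "specificity": sum(specs) / k,
        "f1": sum(f1s) / k,
    }

# Hypothetical binary confusion matrix (illustrative only)
cm = [[8, 2],
      [1, 9]]
m = macro_metrics(cm)
print({name: round(value, 4) for name, value in m.items()})
```

In a multi-class setting, per-class specificity is computed one-vs-rest, so it tends to be higher than recall because each class has many true negatives.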
About the journal:
Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines.
Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data.
The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, Systems Biology, -Omics, etc.)
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well-characterized data sets for testing the performance of new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.