An improved cancer diagnosis algorithm for protein mass spectrometry based on PCA and a one-dimensional neural network combining ResNet and SENet†

IF 3.6, Zone 3 (Chemistry), Q2, CHEMISTRY, ANALYTICAL
Analyst Pub Date : 2024-11-04 DOI:10.1039/D4AN00784K
Liang Ma, Wenqing Gao, Xiangyang Hu, Dongdong Zhou, Chenlu Wang, Jiancheng Yu and Keqi Tang
Analyst, 23, pages 5675-5683
https://pubs.rsc.org/en/content/articlelanding/2024/an/d4an00784k

Abstract

Cancer is one of the most serious health problems worldwide. Because cancer has no specific symptoms in its early stages, it is often not diagnosed until an advanced stage, which reduces the likelihood of successful treatment. Early diagnosis of cancer therefore remains a formidable challenge. Mass spectrometry-based proteomics offers a robust technical foundation for cancer diagnosis. However, mass spectrometry data are characterized by high dimensionality, large data volume, and noise interference, which can lead to diagnostic errors in clinical applications. To address this challenge, an improved algorithm combining principal component analysis (PCA) with a convolutional neural network (CNN), denoted PCA-1DSE-ResCNN, was proposed to assist in analyzing high-dimensional mass spectral data. The algorithm first reduces the dimensionality of the data with PCA. A one-dimensional CNN integrating residual blocks and squeeze-and-excitation blocks (1DSE-ResCNN) is then used as the classifier. This approach not only alleviates the overfitting and vanishing-gradient problems caused by deep network layers but also reduces redundant information, enabling the algorithm to learn high-dimensional data features effectively and to handle nonlinear relationships. To validate the algorithm, high-dimensional ovarian cancer mass spectrometry datasets were selected as examples to examine its performance in the early diagnosis of ovarian cancer. The experimental results demonstrated that PCA-1DSE-ResCNN outperformed other methods in accuracy, specificity, and sensitivity on three high-dimensional ovarian cancer datasets. This study will contribute to the rapid diagnosis and early detection of cancer.
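The two stages described in the abstract, PCA for dimensionality reduction followed by a one-dimensional convolutional block that combines a residual shortcut with squeeze-and-excitation gating, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the number of principal components, channel counts, kernel sizes, reduction ratio, or network depth, so every value below is an assumption, the input spectra are synthetic random numbers, the weights are random and untrained, and normalization layers are omitted. All function and parameter names are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                       # center every m/z feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # (n_samples, n_components)

def conv1d_same(x, w, b):
    """Zero-padded 1D convolution. x: (C_in, L), w: (C_out, C_in, K), K odd."""
    k = w.shape[2]
    xp = np.pad(x, ((0, 0), (k // 2, k // 2)))    # pad the length axis only
    win = np.lib.stride_tricks.sliding_window_view(xp, k, axis=1)  # (C_in, L, K)
    return np.einsum("clk,ock->ol", win, w) + b[:, None]

def squeeze_excite(x, w1, w2):
    """Reweight channels of x (C, L) with a gate computed from channel means."""
    s = x.mean(axis=1)                            # squeeze: global average pool
    h = np.maximum(w1 @ s, 0.0)                   # bottleneck dense layer + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h)))        # excite: sigmoid gate in (0, 1)
    return x * gate[:, None]

def se_residual_block(x, p):
    """conv -> ReLU -> conv -> SE gate, then add the identity shortcut."""
    out = np.maximum(conv1d_same(x, p["w_a"], p["b_a"]), 0.0)
    out = conv1d_same(out, p["w_b"], p["b_b"])
    out = squeeze_excite(out, p["se_w1"], p["se_w2"])
    return np.maximum(out + x, 0.0)

# Assumed sizes: 20 synthetic "spectra" with 500 m/z bins, 16 components,
# 4 channels, kernel size 3, SE reduction ratio 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
Z = pca_reduce(X, n_components=16)                # (20, 16)

C, K, r = 4, 3, 2
params = {
    "w_a": rng.normal(scale=0.1, size=(C, C, K)), "b_a": np.zeros(C),
    "w_b": rng.normal(scale=0.1, size=(C, C, K)), "b_b": np.zeros(C),
    "se_w1": rng.normal(scale=0.1, size=(C // r, C)),
    "se_w2": rng.normal(scale=0.1, size=(C, C // r)),
}
w_stem, b_stem = rng.normal(scale=0.1, size=(C, 1, K)), np.zeros(C)

x = Z[0][None, :]                                 # one sample as a (1, 16) sequence
feat = np.maximum(conv1d_same(x, w_stem, b_stem), 0.0)   # (4, 16)
out = se_residual_block(feat, params)             # (4, 16)
```

Because the shortcut adds the block input back to the gated output, the block can fall back to passing its input through unchanged, which is the usual argument for why residual connections ease the training of deeper stacks. A full classifier would stack several such blocks and end with pooling and a dense output layer, neither of which is shown here.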

Source journal
Analyst (Chemistry: Analytical Chemistry)
CiteScore: 7.80
Self-citation rate: 4.80%
Articles published: 636
Review time: 1.9 months
Journal description: The home of premier fundamental discoveries, inventions and applications in the analytical and bioanalytical sciences